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Sir: 

Enclosed please find a copy of amendments to the claims under Article 1 9 which were 
received by the International Bureau on December 22, 2000, and an English translation of 
the amended claims. 

Based upon the submission of the amended sheets of claims, Applicants respectfully 
request examination on the merits of the application containing amended pages of claims 1- 
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IN THE CLAIMS 



Please amend the claims as follows (a marked-up copy of the claim amendments is 
provided as an attachment to this Amendment): 

2. (Amended) The speech encoder according to claim 1, wherein when one or more 
states of a previous subframe are an extremely large value or an extremely small value, said 
prediction coefficient adjusting means adjusts said prediction coefficients so as to reduce the 
influence thereof. 



38. (Amended) A radio communication system that connects the communication base 
station according to claim 36 with a communication terminal via a radio network. 



Entry of the foregoing amendment to the claims is respectfully requested prior to 
examination and calculation of the filing fees in the above-identified patent application. 

Should there be any questions, the Examiner is invited to contact the undersigned at 
the below listed number. 



REMARKS 



Respectfully submitted, 
Kazutoshi YASUNAGA et al. 



April 20, 2001 

GREENBLUM & BERNSTEIN, P.L.C. 
1941 Roland Clarke Place 
Reston, VA 20191 
(703)716-1191 
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MARKED-UP COPY OF AMENDED CLAIMS 
2. (Amended) The speech encoder according to claim 1, wherein when one [ore] or 
more states of [aprevious subframes] a previous subframe are an extremely large value or an 
extremely small value, said prediction coefficient adjusting means adjusts said prediction 
coefficients so as to reduce the influence thereof. 

38. (Amended) A radio communication system that connects the communication base 
station according to claim 36 with [the] a communication terminal [according to claim 37] 
via a radio network. 
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DESCRIPTION 



APPARATUS AND METHOD FOR SPEECH CODING 



5 Technical Field 

The present invention relates to an apparatus and 
method for speech coding used in a digital communication 
system . 

10 Background Art 

In the field of digital mobile communication such 
as cellular telephones, there is a demand for a low bit 
rate speech compression coding method to cope with an 
increasing number of subscribers, and various research 

15 organizations are carrying forward research and 
development focused on this method. 

In Japan, a coding method called "VSELP" with a bit 
rate of 11.2 kbps developed by Motorola, Inc. is used 
as a standard coding system for digital cellular 

20 telephones and digital cellular telephones using this 
system are on sale in Japan since the fall of 1994. 

Furthermore, a coding system called "PSI-CELP" 
with a bit rate of 5.6 kbps developed by NTT Mobile 
Communications Network, Inc. is now commercialized. 

25 These systems are the improved versions of a system 
called "CELP" (described in "Code Excited Linear 
Prediction: M.R. Schroeder "High Quality Speech at Low 
Bit Rates", Proc.ICASSP '85, pp. 937-940). 
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This CELP system is characterized by adopting a 
method (A-b-S: Analysis by Synthesis) consisting of 
separating speech into excitation information and vocal 
tract information, coding the excitation information 
5 using indices of a plurality of excitation samples stored 
in a codebook, while coding LPC (linear prediction 
coefficients) for the vocal tract information and making 
a comparison with input speech taking into consideration 
the vocal tract information during coding of the 

10 excitation information. 

In this CELP system, an autocorrelation analysis 
and LPC analysis are conducted on the input speech data 
(input speech) to obtain LPC coefficients and the LPC 
coefficients obtained are coded to obtain an LPC code. 

15 The LPC code obtained is decoded to obtain decoded LPC 
coefficients. On the other hand, the input speech is 
assigned perceptual weight by a perceptual weighting 
filter using the LPC coefficients. 

Two synthesized speeches are obtained by applying 

20 filtering to respective code vectors of excitation 

samples stored in an adaptive codebook and stochastic 
codebook (referred to as "adaptive code vector' 7 (or 
adaptive excitation) and "stochastic code vector" (or 
stochastic excitation), respectively) using the 

25 obtained decoded LPC coefficients. 

Then, a relationship between the two synthesized 
speeches obtained and the perceptual weighted input 
speech is analyzed, optimal values (optimal gains) of 
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the two synthesized speeches are obtained, the power of 
the synthesized speeches is adjusted according to the 
optimal gains obtained and an overall synthesized speech 
is obtained by adding up the respective synthesized 
5 speeches. Then, coding distortion between the overall 
synthesized speech obtained and the input speech is 
calculated. In this way, coding distortion between the 
overall synthesized speech and input speech is 
calculated for all possible excitation samples and the 

10 indexes of the excitation samples (adaptive excitation 
sample and stochastic excitation sample) corresponding 
to the minimum coding distortion are identified as the 
coded excitation samples. 

The gains and indexes of the excitation samples 

15 calculated in this way are coded and these coded gains 
and the indexes of the coded excitation samples are sent 
together with the LPC code to the transmission path. 
Furthermore, an actual excitation signal is created from 
two excitations corresponding to the gain code and 

20 excitation sample index, these are stored in the adaptive 
codebook and at the same time the old excitation sample 
is discarded. 

By the way, excitation searches for the adaptive 
codebook and for the stochastic codebook are generally 

25 carried out on a subf rame-basis , where subframe is a 
subdivision of an analysis frame. Coding of gains 
(gain quantization) is performed by vector quantization 
(VQ) that evaluates quantization distortion of the gains 



using two synthesized speeches corresponding to the 
excitation sample indexes. 

In this algorithm, a vector codebook is created 
beforehand which stores a plurality of typical samples 
(code vectors) of parameter vectors. Then, coding 
distortion between the perceptual weighted input speech 
and a perceptual weighted LPC synthesis of the adaptive 
excitation vector and of the stochastic excitation 
vector is calculated using gain code vectors stored in 
the vector codebook from the following expression 1: 

/ 2 
En = ^(Xi - gnxAi-hnx Si) 

Expression 1 

where : 

E n : Coding distortion when nth gain code vector is 

used 

X L : Perceptual weighted speech 

Ai : Perceptual weighted LPC synthesis of adaptive 
code vector 

Si : Perceptual weighted LPC synthesis of stochastic 
code vector 

g n : Code vector element (gain on adaptive 
excitation side) 

h n : Code vector element (gain on stochastic 
excitation side) 

n : Code vector number 

i : Excitation data index 

I : Subframe length (coding unit of input 



speech) Then, distortion E n when each code vector is 

used by controlling the vector codebook is compared and 
the number of the code vector with the least distortion 
is identified as the gain vector code. Furthermore, the 
number of the code vector with the least distortion is 
found from among all the possible code vectors stored 
in the vector codebook and identified to be the vector 
code . 

Expression 1 above seems to require many 
computational complexity for every n, but since the sum 
of products on i can be calculated beforehand, it is 
possible to search n with a small amount of computat ionak 
complexity . 

On the other hand, by determining a code vector 
based on the transmitted code of the vector, a speech 
decoder (decoder) decodes coded data and obtains a code 
vector . 

Moreover, further improvements have been made over 
the prior art based on the above algorithm. For example, 
taking advantage of the fact that the human perceptual 
characteristic to sound intensity is found to have 
logarithmic scale, power is logarithmically expressed 
and quantized, and two gains normalized with that power 
is subjected to VQ . This method is used in the Japan PDC 
half rate CODEC standard system. There is also a method 
of coding using inter-frame correlations of gain 
parameters (predictive coding). This method is used in 
the ITU-T international standard G.729. However, even 
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these improvements are unable to attain performance to 
a sufficient degree* 

Gain information coding methods using the human 
perceptual characteristic to sound intensity and 
inter-frame correlations have been developed so far, 
providing more efficient coding performance of gain 
information. Especially, predictive quantization has 
drastically improved the performance, but the 
conventional method performs predictive quantization 
using the same values as those of previous subframes as 
state values. However, some of the values stored as 
state values are extremely large (small) and using those 
values for the next subframe may prevent the next 
subframe from being quantized correctly, resulting in 
local abnormal sounds . 

Disclosure of Invention 

It is an object of the present invention to provide 
a CELP type speech encoder and encoding method capable 
of performing speech encoding using predictive 
quantization with less including local abnormal sounds. 

A subject of the present invention is to prevent 
local abnormal sounds by automatically adjusting 
prediction coefficients when the state value in a 
preceding subframe is an extremely large value or 
extremely small value in predictive quantization. 

Brief Description of Drawings 
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FIG.l is a block diagram showing a configuration 
of a radio communication apparatus equipped with a speech 
coder/decoder of the present invention; 

FIG. 2 is a block diagram showing a configuration 
5 of the speech encoder according to Embodiment 1 of the 
present invention ; 

FIG, 3 is a block diagram showing a configuration 
of a gain calculation section of the speech encoder shown 
in FIG. 2; 

10 FIG. 4 is a block diagram showing a configuration 

of a parameter coding section of the speech encoder shown 
in FIG. 2; 

FIG. 5 is a block diagram showing a configuration 
of a speech decoder for decoding speech data coded by 
15 the speech encoder according to Embodiment 1 of the 
present invention ; 

FIG. 6 is a drawing to explain an adaptive codebook 
search ; 

FIG. 7 is a block diagram showing a configuration 
20 of a speech encoder according to Embodiment 2 of the 
present invention ; 

FIG. 8 is a block diagram to explain a 
dispersed -pulse codebook; 

FIG. 9 is a block diagram showing an example of a 
25 detailed configuration of the dispersed-pulse codebook; 

FIG. 10 is a block diagram showing an example of a 
detailed configuration of the dispersed-pulse 
codebook; 
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FIG. 11 is a block diagram showing a configuration 
of a speech encoder according to Embodiment 3 of the 
present invention ; 

FIG. 12 is a block diagram showing a configuration 
5 of a speech decoder for decoding speech data coded by 
the speech coder according to Embodiment 3 of the present 
invention ; 

FIG.13A illustrates an example of a dispersed- 
pulse codebook used in the speech encoder according to 
10 Embodiment 3 of the present invention; 

FIG.13B illustrates an example of the 
dispersed-pulse codebook used in the speech decoder 
according to Embodiment 3 of the present invention; 

FIG.14A illustrates an example of the 
15 dispersed-pulse codebook used in the speech encoder 
according to Embodiment 3 of the present invention; and 

FIG.14B illustrates an example of the 
dispersed-pulse codebook used in the speech decoder 
according to Embodiment 3 of the present invention. 

20 

Best Mode for Carrying out the Invention 

With reference now to the attached drawings, 

embodiments of the present invention will be explained 

in detail below. 
25 (Embodiment 1) 

FIG.l is a block diagram showing a configuration 

of a radio communication apparatus equipped with a speech 

encoder /decoder according to Embodiments 1 to 3 of the 
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present invention . 

On the transmitting side of this radio 
communication apparatus, a speech is converted to an 
electric analog signal by speech input apparatus 11 such 
5 as a microphone and output to A/D converter 12. The 
analog speech signal is converted to a digital speech 
signal by A/D converter 12 and output to speech encoding 
section 13, Speech encoding section 13 performs speech 
encoding processing on the digital speech signal and 
10 outputs the coded information to 

modulation/demodulation section 14 . 

Modulation/demodulation section 14 digital -modulates 
the coded speech signal and sends to radio transmission 
section 15, Radio transmission section 15 performs 

15 predetermined radio transmission processing on the 
modulated signal. This signal is transmitted via 
antenna 16. Processor 21 performs processing using data 
stored in RAM 22 and ROM 23 as appropriate. 

On the other hand, on the receiving side of the radio 

20 communication apparatus, a reception signal received 
through antenna 16 is subjected to predetermined radio 
reception processing by radio reception section 17 and 
sent to modulation/demodulation section 14. 
Modulation/demodulation section 14 performs 

25 demodulation processing on the reception signal and 
outputs the demodulated signal to speech decoding 
section 18. Speech decoding section 18 performs 
decoding processing on the demodulated signal to obtain 
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a digital decoded speech signal and outputs the digital 
decoded speech signal to D/A converter 19 . D/A converter 
19 converts the digital decoded speech signal output from 
speech decoding section 18 to an analog decoded speech 
5 signal and outputs to speech output apparatus 20 such 
as a speaker. Finally, speech output apparatus 20 
converts the electric analog decoded speech signal to 
a decoded speech and outputs the decoded speech. 

Here, speech encoding section 13 and speech 

10 decoding section 18 are operated by processor 21 such 
as DSP using codebooks stored in RAM 22 and ROM 23 . These 
operation programs are stored in ROM 23. 

FIG. 2 is a block diagram showing a configuration 
of a CELP type speech encoder according to Embodiment 

15 1 of the present invention. This speech encoder is 

included in speech encoding section 13 shown in FIG.l. 
Adaptive codebook 103 shown in FIG. 2 is stored in RAM 
22 shown in FIG.l and stochastic codebook 104 shown in 
FIG. 2 is stored in ROM 23 shown in FIG.l. 

20 In the speech encoder in FIG. 2, LPC analysis section 

102 performs an autocorrelation analysis and LPC 
analysis on speech data 101 and obtains LPC coefficients. 
Furthermore, LPC analysis section 102 performs encoding 
of the obtained LPC coefficients to obtain an LPC code. 

25 Furthermore, LPC analysis section 102 decodes the 

obtained LPC code and obtains decoded LPC coefficients. 
Speech data 101 input is sent to perceptual weighting 
section 107 and assigned perceptual weight using a 
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perceptual weighting filter using the LPC coefficients 
above . 

Then, excitation vector generator 105 extracts an 
excitation vector sample (adaptive code vector or 
5 adaptive excitation) stored in adaptive codebook 103 and 
an excitation vector sample (stochastic code vector or 
adaptive excitation) stored in stochastic codebook 104 
and sends their respective code vectors to perceptual 
weighted LPC synthesis filter 106. Furthermore, 

10 perceptual weighted LPC synthesis filter 106 performs 
filtering on the two excitation vectors obtained from 
excitation vector generator 105 using the decoded LPC 
coefficients obtained from LPC analysis section 102 and 
obtains two synthesized speeches. 

15 Perceptual weighted LPC synthesis filter 106 uses 

a perceptual weighting filter using the LPC coefficients, 
high frequency enhancement filter and long-term 
prediction coefficient (obtained by carrying out a 
long-term prediction analysis of the input speech) 

20 together and thereby performs a perceptual weighted LPC 
synthesis on their respective synthesized speeches. 

Perceptual weighted LPC synthesis filter 106 
outputs the two synthesized speeches to gain calculation 
section 108. Gain calculation section 108 has a 

25 configuration shown in FIG. 3. Gain calculation section 
108 sends the two synthesized speeches obtained from 
perceptual weighted LPC synthesis filter 106 and the 
perceptual weighted input speech to analysis section 
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1081 and analyzes the relationship between the two 
synthesized speeches and input speech to obtain optimal 
values (optimal gains ) for the two synthesized speeches . 
This optimal gains are output to power adjustment section 
5 1082. 

Power adjustment section 1082 adjusts the two 
synthesized speeches with the optimal gains obtained* 
The power-adjusted synthesized speeches are output to 
synthesis section 1083 and added up there to become an 

10 overall synthesized speech. This overall synthesized 
speech is output to coding distortion calculation 
section 1084. Coding distortion calculation section 
1084 finds coding distortion between the overall 
synthesized speech obtained and input speech. 

15 Coding distortion calculation section 1084 

controls excitation vector generator 105 to output all 
possible excitation vector samples of adaptive codebook 
103 and of stochastic codebook 104, finds coding 
distortion between the overall synthesized speech and 

20 input speech on all excitation vector samples and 

identifies the respective indexes of the respective 
excitation vector samples corresponding to the minimum 
coding distortion . 

Then, analysis section 1081 sends the indexes of 

25 the excitation vector samples, the two perceptual 
weighted LPC synthesized excitation vectors 
corresponding to the respective indexes and input speech 
to parameter coding section 109. 
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Parameter coding section 109 obtains a gain code 
by coding the gains and sends the LPC code, indexes of 
the excitation vector samples all together to the 
transmission path. Furthermore, parameter coding 
5 section 109 creates an actual excitation vector signal 
from the gain code and two excitation vectors 
corresponding to the respective indexes and stores the 
excitation vector into the adaptive codebook 103 and at 
the same time discards the old excitation vector sample 

10 in the adaptive codebook. By the way, an excitation 
vector search for the adaptive codebook and an excitation 
vector search for the stochastic codebook are generally 
performed on a subframe basis, where "subframe" is a 
subdivision of an processing frame ( analys is frame). 

15 Here, the operation of gain encoding of parameter 

coding section 109 of the speech encoder in the above 
configuration will be explained. FIG. 4 is a block 
diagram showing a configuration of the parameter coding 
section of the speech encoder of the present invention. 

20 In FIG. 4, perceptual weighted input speech (X ± ), 

perceptual weighted LPC synthesized adaptive code vector 
(Ai) and perceptual weighted LPC synthesized stochastic 
code vector (Si) are sent to parameter calculation 
section 1091. Parameter calculation section 1091 

25 calculates parameters necessary for a coding distortion 
calculation. The parameters calculated by parameter 
calculation section 1091 are output to coding distortion 
calculation section 1092 and the coding distortion is 
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calculated there. This coding distortion is output to 
comparison section 1093. Comparison section 1093 
controls coding distortion calculation section 1092 and 
vector codebook 1094 to obtain the most appropriate code 
5 from the obtained coding distortion and outputs the code 
vector (decoded vector) obtained from vector codebook 
1094 based on this code to decoded vector storage section 
1096 and updates decoded vector storage section 1096. 
Prediction coefficients storage section 1095 

10 stores prediction coefficients used for predictive 
coding. This prediction coefficients are output to 
parameter calculation section 1091 and coding distortion 
calculation section 1092 to be used for parameter 
calculations and coding distortion calculations. 

15 Decoded vector storage section 1096 stores the states 
for predictive coding. These states are output to 
parameter calculation section 1091 to be used for 
parameter calculations. Vector codebook 1094 stores 
code vectors. 

20 Then, the algorithm of the gain coding method 

according to the present invention will be explained. 

Vector codebook 1094 is created beforehand, which 
stores a plurality of typical samples (code vectors) of 
quantization target vectors. Each vector consists of 
25 three elements; AC gain, logarithmic value of SC gain, 
and an adjustment coefficient for prediction 
coefficients of logarithmic value of SC gain. 

This adjustment coefficient is a coefficient to 
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adjust prediction coefficients according to a states of 
previous subframes. More specif ically , when a state of 
a previous subframe is an extremely large value or an 
extremely small value, this adjustment coefficient is 
5 set so as to reduce that influence. It is possible to 
calculate this adjustment coefficient using a training 
algorithm developed by the present inventor, et al. using 
many vector samples. Here, explanations of this 
training algorithm are omitted. 
10 For example, a large value is set for the adjustment 

coefficient in a code vector frequently used for voiced 
sound segments. That is, when a same waveform is 
repeated in series, the reliability of the states of the 
previous subframes is high, and therefore a large 
15 adjustment coefficient is set so that the large 

prediction coefficients of the previous subframes can 
be used. This allows more efficient prediction. 

On the other hand, a small value is set for the 
adjustment coefficient in a code vector less frequently 
20 used at the onset segments, etc. That is, when the 

waveform is quite different from the previous waveform, 
the reliability of the states of the previous subframes 
is low (the adaptive codebook is considered not to 
function), and therefore a small value is set for the 
25 adjustment coefficient so as to reduce the influence of 
the prediction coefficients of the previous subframes. 
This prevents any detrimental effect on the next 
prediction, making it possible to implement satisfactory 
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predictive coding . 

In this way, adjusting prediction coefficients 
according to code vectors of states makes it possible 
to further improve the performance of predictive coding 
5 so far. 

Prediction coefficients for predictive coding are 
stored in prediction coefficient storage section 1095. 
These prediction coefficients are prediction 
coefficients of MA {Moving Average) and two types of 

10 prediction coefficients, AC and SC , are stored by the 
number corresponding to the prediction order. These 
prediction coefficients are generally calculated 
through training based on a huge amount of sound database 
beforehand. Moreover, values indicating silent states 

15 are stored in decoded vector storage section 1096 as the 
initial values. 

Then, the coding method will be explained in detail 
below. First, a perceptual weighted input speech (X ± ), 
perceptual weighted LPC synthesized adaptive code vector 

20 (A ± ) and perceptual weighted LPC synthesized stochastic 
code vector (Si) are sent to parameter calculation 
section 1091 and furthermore the decoded vector (AC, SC, 
adjustment coefficient) stored in decoded vector storage 
section 1096 and the prediction coefficients (AC, SC) 

25 stored in prediction coefficient storage section 1095 
are sent. Parameters necessary for a coding distortion 
calculation are calculated using these values and 
vectors . 
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A coding distortion calculation by coding 
distortion calculation section 1092 is performed 
according to expression 2 below: 

i 2 
En = g(Xi-GanxAi-GsnxSi) 

Expression 2 

where : 

G an , G sn : Decoded gain 

E n : Coding distortion when nth gain code vector is 

used 

Xi : Perceptual weighted speech 

Ai : Perceptual weighted LPC synthesized adaptive 
code vector 

Si : Perceptual weighted LPC synthesized stochastic 
code vector 

n : Code vector number 

i : Excitation vector index 

I : Subframe length (coding unit of input speech) 
In order to reduce the amount of calculation/ 
parameter calculation section 1091 calculates the part 
independent of the code vector number. What should be 
calculated are correlations between three synthesized 
speeches (X if A if S £ ) and powers. These calculations are 
performed according to expression 3 below: 
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/ 

Dxx = ^XixXi 

i 

Dxa = ^Xi xAi x 2 
Dxs = ^ Xi x Si x 2 

i-0 

Daa = \* Ai x Az 
Das = VAix x 2 

7 

i-0 

Expression 3 

where : 

Dxx*D xa ,D xs ,D aa/ D as ,D ss s Correlation value between 
5 synthesized speeches, power 

X ± : Perceptual weighted speech 

Ai : Perceptual weighted LPC synthesized adaptive 
code vector 

Si : Perceptual weighted LPC synthesized stochastic 
1 0 code vector 

n : Code vector number 

i : Excitation vector index 

I : Subframe length (coding unit of input speech) 
Furthermore, parameter calculation section 1091 
15 calculates three predictive values shown in expression 
4 below using past code vectors stored in decoded vector 
storage section 1096 and prediction coefficients stored 
in prediction coefficient storage section 1095. 
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M 

Pra = ^ cxm x Sam 

771 = 0 
A/ 

Pr s « ^ x x 

/re = 0 
M 

Psc = ^ /?m x Scm 

Expression 4 

where : 

P ra : Predictive value (AC gain) 
P rs : Predictive value ( SC gain) 

P sc : Predictive value (prediction coefficient) 
a m : Prediction coefficient (AC gain, fixed value) 
/5 m : Prediction coefficient (SC gain, fixed value) 
S am : State (element of past code vector, AC gain) 
S sm : State (element of past code vector, SC gain) 
S cm : State (element of past code vector, SC 
prediction coefficient adjustment coefficient) 
m: Predictive index 
M: Prediction order 

As is apparent from expression 4 above, with regard 
to P rs and P sc , adjustment coefficients are multiplied 
unlike the conventional art. Therefore, regarding the 
predictive value and prediction coefficient of an SC gain, 
when a value of a state in the previous subframe is 
extremely large or extremely small, it is possible to 
alleviate the influence (reduce the influence) by means 
of the adjustment coefficient. That is, it is possible 
to adaptively change the predictive value and prediction 
coefficients of the SC gain according to the states. 
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Then, coding distortion calculation section 1092 
calculates coding distortion using the parameters 
calculated by parameter calculation section 1091, the 
prediction coefficients stored in prediction 
5 coefficient storage section 1095 and the code vectors 
stored in vector codebook 1094 according to expression 
5 below: 

En = Dxx + (Gan) 2 xDaa + (Gsn) 2 xDss - Gan xDxa 
- Gsn x Dxs + Gan x Gsn x Das 

Gan = Pr a + (1 - Pac)x Can 
Gsn -10 A {Pr s + (1- Psc)xCsn} 

Expression 5 

10 where: 

E n : Coding distortion when nth gain code vector is 

used 

D xx/ D xa/ D xs/ D aa , D as ,D ss : Correlation value between 
synthesized speeches, power 
15 G an/ G sn : Decoded gain 

P ra : Predictive value (AC gain) 
P rs : Predictive value (SC gain) 

P ac : Sum of prediction coefficients (fixed value) 
P sc : Sum of prediction coefficients ( calculated by 
20 expression 4 above) 

C an , C sn , C cn : Code vector, C cn is a prediction 
coefficient adjustment coefficient, but not used here 
n: Code vector number 

D xx is actually independent of code vector number 
25 n, and the addition of D xx can be omitted. 
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Then, comparison section 1093 controls vector 
codebook 1094 and coding distortion calculation section 
1092 and finds the code vector number corresponding to 
the minimum coding distortion calculated by coding 
5 distortion calculation section 1092 from among a 

plurality of code vectors stored in vector codebook 1094 
and identifies this as the gain code. Furthermore, the 
content of decoded vector storage section 1096 is updated 
using the gain code obtained. The update is performed 
10 according to expression 6 below: 
Sam-Sam- l(m=M~ l) ? Sa0=CaJ 
Ssm=Ssm-l(m=M~l),SsO=CsJ 
Scm=Ssm-l(m=M~l),ScO=CcJ 

Expression 6 

15 where: 

S am / s sm/ S cm : State vector (AC, SC, prediction 
coefficient adjustment coefficient) 
m: Predictive index 
M: Prediction order 
20 J: Code obtained from comparison section 

As is apparent from Expression 4 to Expression 6, 
in this embodiment, decoded vector storage section 1096 
stores state vector S cm and prediction coefficients are 
adaptively controlled using these prediction 
25 coefficient adjustment coefficients. 

FIG. 5 shows a block diagram showing a configuration 
of the speech decoder according to this embodiment of 
the present invention. This speech decoder is included 
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in speech decoding section 18 shown in FIG . 1 • By the way, 
adaptive codebook 202 in FIG. 5 is stored in RAM 22 in 
FIG.l and stochastic codebook 203 in FIG. 5 is stored in 
ROM 23 in FIG.l. 
5 In the speech decoder in FIG. 5, parameter decoding 

section 201 obtains the respective excitation vector 
sample codes of respective excitation vector codebooks 
(adaptive codebook 2 0 2 , stochastic codebook 203), LPC 
codes and gain codes from the transmission path. 

10 Parameter decoding section 201 then obtains decoded LPC 
coefficients from the LPC code and obtains decoded gains 
from the gain code. 

Then, excitation vector generator 204 obtains 
decoded excitation vectors by multiplying the 

15 respective excitation vector samples by the decoded 

gains and adding up the multiplication results. In this 
case, the decoded excitation vector obtained are stored 
in adaptive codebook 204 as excitation vector samples 
and at the same time the old excitation vector samples 

20 are discarded. Then, LPC synthesis section 205 obtains 
a synthesized speech by filtering the decoded excitation 
vector with the decoded LPC coefficients. 

The two excitation codebooks are the same as those 
included in the speech encoder in FIG. 2 (reference 

25 numerals 103 and 104 in FIG. 2) and the sample numbers 
(codes for the adaptive codebook and codes for the 
stochastic codebook) to extract the excitation vector 
samples are supplied from parameter decoding section 
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201 . 

Thus, the speech encoder of this embodiment can 
control prediction coefficients according to each code 
vector, providing more efficient prediction more 
5 adaptable to local characteristic of speech, thus making 
it possible to prevent detrimental effects on prediction 
in the non-stationary segment and attain special effects 
that have not been attained by conventional arts . 
( Embodiment 2 ) 

10 As described above, the gain calculation section 

in the speech encoder compares synthesized speeches and 
input speeches of all possible excitation vectors in the 
adaptive codebook and in the stochastic codebook 
obtained from the excitation vector generator- At this 

15 time, two excitation vectors (adaptive codebook vector 
and stochastic codebook vector) are generally searched 
in an open-loop for the consideration of the amount of 
computational complexity. This will be explained with 
reference to FiG.2 below. 

20 In this open-loop search, excitation vector 

generator 105 selects excitation vector candidates only 
from adaptive codebook 103 one after another, makes 
perceptual weighted LPC synthesis filter 106 function 
to obtain a synthesized speech and send to gain 

25 calculation section 108, compares the synthesized speech 
and input speech and selects an optimal code of adaptive 
codebook 10 3. 

Then, excitation vector generator 105 fixes the 
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code of adaptive codebook 103 above, selects the same 
excitation vector from adaptive codebook 103 and selects 
excitation vectors corresponding to gain calculation 
section 108 one after another from stochastic codebook 
5 104 and sends to perceptual weighted LPC synthesis filter 
106. Gain calculation section 108 compares the sum of 
both synthesized speeches and the input speech to 
determine the code of stochastic codebook 104. 

When this algorithm is used, the coding performance 

10 deteriorates slightly compared to searching codes of all 
codebooks respectively, but the amount of computational 
complexity is reduced drastically. For this reason, 
this open-loop search is generally used. 

Here, a typical algorithm in a conventional 

15 open-loop excitation vector search will be explained. 
Here, the excitation vector search procedure when one 
analysis section (frame) is composed of two subframes 
will be explained. 

First, upon reception of an instruction from gain 

20 calculation section 108, excitation vector generator 105 
extracts an excitation vector from adaptive codebook 103 
and sends to perceptual weighted LPC synthesis filter 
106. Gain calculation section 108 repeatedly compares 
the synthesized excitation vector and the input speech 

25 of the first subframe to find an optimal code. Here, the 
features of the adaptive codebook will be shown. The 
adaptive codebook consists of excitation vectors past 
used for speech synthesis. A code corresponds to a time 



lag as shown in FIG. 6. 

Then, after a code of adaptive codebook 103 is 
determined, a search for the stochastic codebook is 
started. Excitation vector generator 105 extracts the 
5 excitation vector of the code obtained from the search 
of the adaptive codebook 103 and the excitation vector 
of the stochastic codebook 104 specified by gain 
calculation section 108 and sends these excitation 
vectors to perceptual weighted LPC synthesis filter 106. 

10 Then, gain calculation section 108 calculates coding 
distortion between the perceptual weighted synthesis 
speech and perceptual weighted input speech and 
determines an optimal (whose square error becomes a 
minimum) code of stochastic excitation vector 104. The 

15 procedure for an excitation vector code search in one 
analysis section (in the case of two subframes) is shown 
below. 

1) Determines the code of the adaptive codebook of 
the first subframe. 
20 2 ) Determines the code of the stochastic codebook 

of the first subframe. 

3) Parameter coding section 109 codes gains, 
generates the excitation vector of the first subframe 
with decoded gains and updates adaptive codebook 103. 
25 4 ) Determines the code of the adaptive codebook of 

the second subframe. 

5) Determines the code of the stochastic codebook 
of the second subframe. 
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6) Parameter coding section 109 codes the gains, 
generates the excitation vector of the second subframe 
with decoded gain and updates adaptive codebook 103. 
The algorithm above allows efficient coding of 
5 excitation vectors. However, an effort has been 

recently developedfor decreasing the number of bits of 
excitation vectors aiming at a further reduction of the 
bit rate. What receives special attention is an 
algorithm of reducing the number of bits by taking 
10 advantage of the presence of a large correlation in a 
lag of the adaptive codebook and narrowing the search 
range of the second subframe to the range close to the 
lag of the first subframe (reducing the number of 
entries) while leaving the code of the first subframe 
15 as it is . 

With this recently developed algorithm, local 
deterioration may be provoked, in the case speech signal 
in an analysis segment (frame) has a large change, or 
in the case the characteristics of the consecutive two 
20 frames are much different 

This embodiment provides a speech encoder that 
implements a search method of calculating correlation 
values by performing a pitch analysis for two subframes 
respectively, before starting coding and determining the 
25 range of searching a lag between two subframes based on 
the correlation values obtained. 

More specifically, the speech encoder of this 
embodiment is a CELP type encoder that breaks down one 
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frame into a plurality of subf rames and codes respective 
frames, characterized by comprising a pitch analysis 
section that performs a pitch analysis of a plurality 
of subframes in the processing frame respectively, and 
5 calculates correlation values before searching the first 
subframe in the adaptive codebook and a search range 
setting section that while the pitch analysis section 
calculates correlation values of a plurality of 
subframes in the processing frame respectively, finds 

10 the value most likely to be the pitch cycle (typical 
pitch) on each subframe from the size of the correlation 
values and determines the search range of a lag between 
a plurality of subframes based on the correlation values 
obtained by the pitch analysis section and the typical 

15 pitch. Then, the search range setting section of this 
speech encoder determines a provisional pitch that 
becomes the center of the search range using the typical 
pitch of a plurality of subframes obtained by the pitch 
analysis section and the correlation value and the search 

20 range setting section sets the lag search range in a 
specified range around the determined provisional pitch 
and sets the search range before and after the 
provisional pitch when the lag search range is set* 
Moreover, in this case, the search range setting section 

25 reduces the number of candidates for the short lag 

section (pitch period), widely sets the range of a long 
lag and searches the lag in the range set by the search 
range setting section during the search in the adaptive 
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codebook . 

The speech encoder of this embodiment will be 
explained in detail below using the attached drawings. 
Here, suppose one frame is divided into two subframes. 
5 The same procedure can also be used for coding in the 
case of 3 subframes or more. 

In a pitch search according to a so-called delta 
lag coding system, this speech coder finds pitches of 
all subframes in the processing frame, determines the 

10 level of a correlation between pitches and determines 
the search range according to the correlation result. 

FIG. 7 is a block diagram showing a configuration 
of the speech encoder according to Embodiment 2 of the 
present invention. First, LPC analysis section 302 

15 performs an autocorrelation analysis and LPC analysis 
on speech data input (input speech) 301 entered and 
obtains LPC coefficients. Moreover, LPC analysis 
section 302 performs coding on the LPC coefficients 
obtained and obtains an LPC code. Furthermore, LPC 

20 analysis section 302 decodes the LPC code obtained and 
obtains decoded LPC coefficients. 

Then, pitch analysis section 310 performs pitch 
analysis for consecutive 2 subframe respectively, and 
obtains a pitch candidate and a parameter for each 

25 subframe. The pitch analysis algorithm for one subframe 
is shown below. Two correlation coefficients are 
obtained from expression 7 below. At this time, C pp is 
obtained about P mln first and remaining P min+1 and P min+2 can 
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be calculated efficiently by subtraction and addition 
of the values at the frame end. 

L 

= yxixXi-P (P=Pmin~Pmax) 



Cpp = y Xi -PxXi -P (P=Pmin~Pmax) 

i-0 

Expression 7 

where : 

XX if Xi_ p : Input speech 

V p : Autocorrelation function 

C pp : Power component 

i: Input speech sample number 

L: Subframe length 

P: Pitch 

P min ,P max : Minimum value and maximum value for pitch 
search 

Then, the autocorrelation function and power 
component calculated from expression 7 above are stored 
in memory and the following procedure is used to 
calculate typical pitch P^ This is the processing of 
calculating pitch P that corresponds to a maximum of V p 
Xv p /C pp while V p is positive. However, since a division 
calculation generally requires a greater amount of 
computational complexities, both the numerator and 
denominator are stored to convert the division to a 
multiplication toreduce the computational complexities. 

Here, a pitch is found in such a way that the sum 
of square of the input speech and the square of the 
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difference between the input speech and the adaptive 
excitation vector ahead of the input speech by the pitch 
becomes a minimum. This processing is equivalent to the 
processing of finding pitch P corresponding to a maximum 
5 of V p xv p /C pp . Specific processing is as follows: 

1) Initialization (P=P min , w=C=0, P^P^) 

2) If (V P X v p Xc<VVXc pp ) or (V p <0), then go to 4). 
Otherwise, go to 3). 

3) Supposing VV=V p Xv p/ C=C pp , P X =P, go to 4 ) . 
10 4) Suppose P=P+1. At this time, if P>P max , the 

process ends. Otherwise, go to 2). 

Perform the operation above for each of 2 subf rames 
to calculate typical pitches P 1 and P 2 , autocorrelation 
coefficients V lp and V 2p , power components C lpp and C 2pp 

15 (P rnin <P<Pmax) • 

Then, search range setting section 311 sets the 
search range of the lag in the adaptive codebook. First, 
a provisional pitch, which is the center of the search 
range is calculated. The provisional pitch is 

20 calculated using the typical pitch and parameter 
obtained by pitch analysis section 310. 

Provisional pitches Q 1 and Q 2 are calculated using 
the following procedure. In the following explanation, 
constant Th (more specifically, a value 6 or so is 

25 appropriate) as the lag range. Moreover, the 

correlation value obtained from expression 7 above is 
used . 

While P x is fixed, provisional pitch ( Q 2 ) with the 
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maximum correlation is found near P a (±Th) first. 

1) Initialization (p=P 1 -Th, C max =0 , Q^P^ Q 2 = ?i) 

2) If (V lpl xv lpl /C lplpl +V 2p xv 2p /C 2pp <C max ) or (V 2p <0) then 
go to 4). Otherwise, go to 3). 

5 3) Supposing C max =V lpl xV lpl /C lplpl +V 2p xV 2p /C 2pp , Q 2 =p , go 

to 4 ) . 

4 ) Supposing p=p+l , go to 2 ) • However, at this time, 
if p>P 1 +Th, go to 5). 

In this way, processing in 2) to 4) is performed 
10 from P-L-Th to P L +Th, the one with the maximum correlation, 
C max and provisional pitch Q 2 are found. 

Then, while P 2 is fixed, provisional pitch ( Q x ) near 
P 2 (±Th) with a maximum correlation is found. In this 
case, C max will not be initialized. By calculating Q x 
15 whose correlation becomes a maximum including C max when 
Q 2 is found, it is possible to find Q 1 and Q 2 with the 
maximum correlation between the first and second 
subf rames . 

5) initialization (p=P 2 -Th) 
20 6) If (V lp xv lp /C lpp +V 2p2 xv 2p2 /C 2p2p2 <C max ) or (V lp <0), go 

to 8). Otherwise, go to 7). 

7) Supposing 
C max =V lp xV lp /C lpp +V 2p2 xV 2p2 /C 2p2p2 ,Q 1 =p,Q 2 = P 2 , go to 8). 

8 ) Supposing p=p+l / go to 6 ) . However, at this time 
25 if p>P 2 +Th, go to 9). 

9 ) End 

In this way, perform processing in 6 ) to 8 ) from 
P 2 -Th to P 2 +Th, the one with the maximum correlation, C max 
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and provisional pitches Q 2 and Q 2 are found. Q x and Q 2 at 
this time are provisional pitches of the first and second 
sub frames , respectively . 

From the algorithm above, it is possible to select 
5 two provisional pitches with a relatively small 

difference in size (the maximum difference is Th) while 
evaluating the correlation between two subframes 
simultaneously. Using these provisional pitches 
prevents the coding performance from drastically 
10 deteriorating even if a small search range is set during 
a search of the second subf rame in the adaptive codebook. 
For example, when sound quality changes suddenly from 
the second subframe, if there is a strong correlation 
of the second subframe, using Q x that reflects the 
15 correlation of the second subframe can avoid the 
deterioration of the second subframe. 

Furthermore, search range setting section 311 sets 
the search range (L__ ST to L__ EN ) of the adaptive codebook 
using provisional pitch Q x obtained as expression 8 
20 below: 

First subframe 

L_ST=Q1 -5 (when L_ST<Lmin, L_ST=Lmin) 

L_EN=L__ST+2 0 (when L_ST>Lmax , L_ST=Lmax ) 
Second subframe 
25 L_ST=Tl-l 0 (when L_ST<Lmin, L_ST=Lmin) 

L_EN=L_ST+2 1 (when L_ST>Lmax , L__ST=Lmax) 

Expression 8 

where : 
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L_ ST : Minimum of search range 
L_ EN : Maximum of search range 
L min : Minimum value of lag (e.g., 20) 
L MX : Maximum value of lag (e.g., 143) 
5 T 1 z Adaptive codebook lag of first frame 

In the above setting, it is not necessary to narrow 
the search range for the first subframe. However, the 
present inventor, et al. have confirmed through 
experiments that the performance is improved by setting 
10 the vicinity of a value based on the pitch of the input 
speech as the search range and this embodiment uses an 
algorithm of searching by narrowing the search range to 
26 samples. 

On the other hand, for the second subframe, the 
15 search range is set to the vicinity of lag T 2 obtained 
by the first subframe. Therefore, it is possible to 
perform 5-bit coding on the adaptive codebook lag of the 
second subframe with a total of 32 entries. Furthermore, 
the present inventor, et al . have also confirmed this 
20 time through experiments that the performance is 

improved by setting fewer candidates with a short lag 
and more candidates with a long lag. However, as is 
apparent from the explanations heretofore, this 
embodiment does not use provisional pitch Q 2 . 
25 Here, the effects of this embodiment will be 

explained. In the vicinity of the provisional pitch of 
the first subframe obtained by search range setting 
section 311 , the provisional pitch of the second subframe 
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also exists (because it is restricted with constant Th) . 
Furthermore, since a search has been performed with the 
search range narrowed in the first subframe, the lag 
resultant from the search is not separated from the 
5 provisional pitch of the first subframe. 

Therefore, when the second subframe is searched, 
the search can be performed in the range close to the 
provisional pitch of the second subframe, and therefore 
it is possible to search lags appropriate for both the 

10 first and second frames. 

Suppose a example where the first subframe is a 
silent-speech and the second subframe is not a 
silent-speech. According to the conventional method, 
sound quality will deteriorate drastically if the second 

15 subframe pitch is no longer included in the search 

section by narrowing the search range. According to the 
method of this embodiment, a strong correlation of 
typical pitch P 2 is reflected in the analysis of the 
provisional pitch of the pitch analysis section. 

20 Therefore, the provisional pitch of the first subframe 
has a value close to P 2 . This makes it possible to 
determine the range close to the part at which the speech 
starts as the provisional pitch in the case of a search 
by a delta lag. That is, in the case of an adaptive 

25 codebook search of the second subframe, a value close 
to P 2 can be searched, and therefore it is possible to 
perform an adaptive codebook search of the second 
subframe by a delta lag even if speech starts at some 
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midpoint in the second subframe. 

Then, excitation vector generator 305 extracts the 
excitation vector sample (adaptive code vector or 
adaptive excitation vector) stored in adaptive codebook 
5 303 and the excitation vector sample (stochastic code 
vector or stochastic excitation vector) stored in 
stochastic codebook 304 and sends these excitation 
vector samples to perceptual weighted LPC synthesis 
filter 306. Furthermore, perceptual weighted LPC 

10 synthesis filter 306 performs filtering on the two 
excitation vectors obtained by excitation vector 
generator 305 using the decoded LPC coefficients 
obtained by LPC analysis section 302. 

Furthermore, gain calculation section 308 analyzes 

15 the relationship between the two synthesized speeches 
obtained by perceptual weighted LPC synthesis filter 306 
and the input speech and finds respective optimal values 
(optimal gains) of the two synthesized speeches. Gain 
calculation section 308 adds up the respective 

20 synthesized speeches with power adjusted with the 

optimal gain and obtains an overall synthesized speech. 
Then, gain calculation section 308 calculates coding 
distortion between the overall synthesized speech and 
the input speech. Furthermore, gain calculation 

25 section 308 calculates coding distortion between many 
synthesized speeches obtained by making function 
excitation vector generator 305 and perceptual weighted 
LPC synthesis filter 306 on all excitation vector samples 
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in adaptive codebook 303 and stochastic codebook 304 and 
the input speech, and finds the indexes of the excitation 
vector samples corresponding to the minimum of the 
resultant coding distortion. 
5 Then, gain calculation section 308 sends the 

indexes of the excitation vector samples obtained and 
the two excitation vectors corresponding to the indexes 
and the input speech to parameter coding section 309. 
Parameter coding section 309 obtains a gain code by 

10 performing gain coding and sends the gain code together 
with the LPC code and indexes of the excitation vector 
samples to the transmission path. 

Furthermore, parameter coding section 309 creates 
an actual excitation vector signal from the gain code 

15 and the two excitation vectors corresponding to the 

indexes of the excitation vector samples and stores the 
actual excitation vector signal in adaptive codebook 303 
and at the same time discards the old excitation vector 
sample . 

20 By the way, perceptual weighted LPC synthesis 

filter 306 uses a perceptual weighting filter using an 
LPC coefficients, high frequency enhancement filter and 
long-term prediction coefficient (obtained by 
performing a long-term predictive analysis of the input 

25 speech ) . 

Gain calculation section 308 above makes a 
comparison with the input speech about all possible 
excitation vectors in adaptive codebook 303 and all 
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possible stochastic codebook 304 obtained from 
excitation vector generator 305, but two excitation 
vectors (adaptive codebook 303 and stochastic codebook 
304) are searched in an openloop as described above in 
5 order to reduce the amount of computational complexity. 

Thus, the pitch search method in this embodiment 
performs pitch analyses of a plurality of subframes in 
the processing frame respectively before performing an 
adaptive codebook search of the first subframe, then 

10 calculates a correlation value and thereby can control 
correlation values of all subframes in the frame 
simultaneously . 

Then, the pitch search method in this embodiment 
calculates a correlation value of each subframe, finds 

15 a value most likely to be a pitch period (called a "typical 
pitch") in each subframe according to the size of the 
correlation value and sets the lag search range of a 
plurality of subframes based on the correlation value 
obtained from the pitch analysis and typical pitch. In 

20 the setting of this search range, the pitch search method 
in this embodiment obtains an appropriate provisional 
pitch (called a "provisional pitch") with a small 
difference, which will be the center of the search range, 
using the typical pitches of a plurality of subframes 

25 obtained from the pitch analyses and the correlation 
values . 

Furthermore, the pitch search method in this 
embodiment confines the lag search section to a specified 
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range before and after the provisional pitch obtained 
in the setting of the search range above, allowing an 
efficient search of the adaptive codebook. In that case, 
the pitch search method in this embodiment sets fewer 
5 candidates with a short lag part and a wider range with 
a long lag, making it possible to set an appropriate 
search range where satisfactory performance can be 
obtained. Furthermore, the pitch search method in this 
embodiment performs a lag search within the range set 

10 by the setting of the search range above during an 

adaptive codebook search, allowing coding capable of 
obtaining satisfactory decoded sound. 

Thus, according to this embodiment, the 
provisional pitch of the second subf rame also exists near 

15 the provisional pitch of the first subframe obtained by 
search range setting section 311 and the search range 
is narrowed in the first subframe, and therefore the lag 
resulting from the search does not get away from the 
provisional pitch. Therefore, during a search of the 

20 second subframe, it is possible to search around the 
provisional pitch of the second subframe allowing an 
appropriate lag search in the first and second subframes 
even in a non-stationary frame in the case where a speech 
starts from the last half of a frame, and thereby attain 

25 a special effect that has not been attained with 
conventional arts . 
(Embodiment 3) 

An initial CELP system uses a stochastic codebook 
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with entries of a plurality of types of random sequence 
as stochastic excitation vectors, that is, a stochastic 
codebook with a plurality of types of random sequence 
directly stored in memory. On the other hand, many low 
5 bit-rate CELP encoder /decoder have been developed in 
recent years, which include an algebraic codebook to 
generate stochastic excitation vectors containing a 
small number of non-zero elements whose amplitude is +1 
or -1 (the amplitude of elements other than the non- 
10 zero element is zero) in the stochastic codebook section. 

By the way, the algebraic codebook is disclosed in 
the "Fast CELP Coding based on Algebraic codes", J.Adoul 
et al, Proc. IEEE Int. Conf. Acoustics, Speech, Signal 
Processing, 1987, pp. 1957-1960 or "Comparison of Some 
15 Algebraic Structure for CELP Coding of Speech", J.Adoul 
et al, Proc. IEEE Int. Conf. Acoustics, Speech, Signal 
Processing, 1987, pp. 1953-1956, etc. 

The algebraic codebook disclosed in the above 
papers is a codebook having excellent features such as 
20 (1) ability to generate synthesized speech of high 

quality when applied to a CELP system with a bit rate 
of approximately 8 kb/s, (2) ability to search a 
stochastic with a small amount of computational 
complexity, and (3) elimination of the necessity of data 
25 ROM capacity to directly store stochastic excitation 
vectors . 

Then, CS-ACELP (bit rate: 8 kb/s) and ACELP (bit 
rate: 5.3 kb/s) characterized by using an algebraic 
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codebook as a stochastic codebook are recommended as 
G.729 and g723.1, respectively from the ITU-T in 1996 * 
By the way, detailed technologies of CS-ACELP are 
disclosed in "Design and Description of CS-ACELP:A Toll 
5 Quality 8 kb/s Speech Coder", Redwan Salami et al, IEEE 
trans, SPEECH AND AUDIO PROCESSING, vol. 6, no. 2, March 
1998, etc. 

The algebraic codebook is a codebook with the 
excellent features as described above. However, when 

10 the algebraic codebook is applied to the stochastic 

codebook of a CELPencoder /decoder , the target vector for 
stochastic codebook search is always encoded/decoded 
(vector quantization) with stochastic excitation 
vectors including a small number of non-zero elements, 

15 and thus the algebraic codebook has a problem that it 
is impossible to a express a target vector for 
stochastic codebook search in high fidelity. This 
problem becomes especially conspicuous when the 
processing frame corresponds to an unvoiced consonant 

20 segment or background noisesegment . 

This is because the target vector for stochastic 
codebook search often takes a complicated shape in an 
unvoiced consonant segment or background noisesegment. 
Furthermore, in the case where the algebraic codebook 

25 is applied to a CELP encoder /decoder whose bit rate is 
much lower than the order of 8 kb/s, the number of non-zero 
elements in the stochastic excitation vector is reduced, 
and therefore the above problem can become a bottleneck 
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even in a stationary voiced segment where the target 
vector for stochastic codebook search is likely to be 
a pulse-like shape. 

As one of methods for solving the above problem of 
5 the algebraic codebook, a method using a dispersed-pulse 
codebook is disclosed, which uses a vector obtained by 
convoluting a vector containing a small number of 
non-zero elements (elements other than non-zero elements 
have a zero value) output from the algebraic codebook 

10 and a fixed waveform called a "dispersion pattern" as 
the excitation vector of a synthesis filter* The 
dispersed-pulse codebook is disclosed in the Unexamined 
Japanese Patent Publication No.HEI 10-232696, "ACELP 
Coding with Disper sed-Pulse Codebook" (by Yasunaga, et 

15 al * , Collection of Preliminary Manuscripts of National 
Conference of Institute of Electronics, Information and 
Communication Engineers in Springtime 1997, D-14-11, 
p. 253 , 1997-03 ) and "A Low Bit Rate Speech Coding with 
Multi Dispersed Pulse based Codebook" (by Yasunaga, et 

20 al., Collected Papers of Research Lecture Conference of 
Acoustical Society of Japan in Autumn 1998, pp. 281-282, 
1998-10) , etc. 

Next, an outline of the dispersed-pulse codebook 
disclosed in the above papers will be explained using 

25 FIG. 8 and FIG. 9. FIG. 9 shows a further detailed example 
of the dispersed-pulse codebook in FIG. 8. 

In the dispersed-pulse codebook in FIG. 8 and FIG. 9, 
algebraic codebook 4011 is a codebook for generating a 
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pulse vector made up of a small number of non-zero 
elements (amplitude is +1 or -1). The CELP 
encoder /decoder described in the above papers uses a 
pulse vector (made up of a small number of non-zero 
5 elements ) , which is the output of algebraic codebook 4 011 , 
as the stochastic excitation vector. 

Dispersion pattern storage section 4012 stores at 
least one type of fixed waveform called a "dispersion 
pattern" for every channel. There can be two cases of 

10 dispersion patterns stored for every channel: one case 
where dispersion patterns differing from one channel to 
another are stored and the other case where a dispersion 
pattern of a same (common) shape for all channels is 
stored. The case where a common dispersion pattern is 

15 stored for all channels corresponds to simplification 
of the case where dispersion pattern differing from one 
channel to another are stored, and therefore the case 
where dispersion patterns differing from one channel to 
another are stored will be explained in the following 

20 explanations of the present description. 

Instead of directly outputting the output vector 
from algebraic codebook 4011 as a stochastic excitation 
vector, dispersed-pulse codebook 401 convolutes the 
vector output from algebraic codebook 4011 and 

25 dispersion patterns read from dispersion pattern storage 
section 4012 for every channel in pulse dispersing 
section 4013, adds up vectors resulting from the 
convolution calculations and uses the resulting vector 
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as the stochastic excitation vector. 

The CELP encoder /decoder disclosed in the above 
papers is characterized by using a dispersed-pulse 
codebook in a same configuration for the encoder and 
5 decoder (the number of channels in the algebraic codebook, 
the number of types and shape of dispersion patterns 
registered in the dispersion pattern storage section are 
common between the encoder and decoder). Moreover, the 
CELP encoder /decoder disclosed in the above papers aims 

10 at improving the quality of synthesized speech by 

efficiently setting the shapes and the number of types 
of dispersion patterns registered in dispersion pattern 
storage section 4012, and the method of selecting in the 
case where a plurality of types of dispersion patterns 

15 are registered. 

By the way, the explanation of the dispersed-pulse 
codebook here describes the case where an algebraic 
codebook that confines the amplitude of non-zero 
elements to +1 or -1 is used as the codebook for generating 

20 a pulse vector made up of a small number of non-zero 
elements. However, as the codebook for generating the 
relevant pulse vectors, it is also possible to use a 
multi-pulse codebook that does not confine the amplitude 
of non-zero elements or a regular pulse codebook, and 

25 in such cases, it is also possible to improve the quality 
of the synthesized speech by using a pulse vector 
convoluted with a dispersion pattern as the stochastic 
excitation vector . 
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It has been disclosed so far that it is possible 
to effectively improve the quality of a synthesized 
speech by registering dispersion patterns obtained by 
statistically training of shapes based on a huge number 
5 of target vectors for stochastic codebook search, 

dispersion patterns of random-like shapes to efficiently 
express the unvoiced consonant segments and noise-like 
segments, dispersion patterns of pulse-like shapes to 
efficiently express the stationary voiced segment, 

10 dispersion patterns of shapes such that the energy of 
pulse vectors output from the algebraic codebook (energy 
is concentrated on the positions of non-zero elements) 
is spread around, dispersion patterns selected from 
among several arbitrarily prepared dispersion pattern 

15 candidates so that a synthesized speech of high quality 
can be output by encoding and decoding a speech signal 
and repeating subjective (listening) evaluation tests 
of the synthesized speech or dispersion patterns created 
based on phonological knowledge, etc. at least one type 

20 per non-zero element (channel) in the excitation vector 
output from the algebraic codebook, convoluting the 
registered dispersion patterns and vectors generated by 
the algebraic codebook (made up of a small number of 
non-zero elements) for every channel, adding up the 

25 convolution results of respective channels and using the 
addition result as the stochastic excitation vector. 

Moreover, especially when dispersion pattern 
storage section 4012 registers dispersion patterns of 
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a plurality of types (two or more types) per channel, 
methods disclosed as the methods for selecting a 
plurality of these dispersion patterns include: a method 
of actually performing encoding and decoding on all 
5 combinations of the registered dispersion patterns 
and "closed-loop search" a dispersion pattern 
corresponding to a minimum of the resulting coding 
distortion and a method for "open-loop search " 
dispersion patterns using speech-like information which 

10 is already made clear when a stochastic codebook search 
is performed (the speech-like information here refers 
to, for example, voicing strength information judged 
using dynamic variation information of gain codes or 
comparison result between gain values and a preset 

15 threshold value or voicing strength information judged 
using dynamic variation of linear predictive codes). 

By the way, for simplicity of explanations, the 
following explanations will be confined to a 
dispersed-pulse codebook in FIG. 10 characterized in that 

20 dispersion pattern storage section 4012 in the 

dispersed-pulse codebook in FIG. 9 registers dispersion 
pattern of only one type per channel. 

Here, the following explanation will describe 
stochastic codebook search processing in the case where 

25 a dispersed-pulse codebook is applied to a CELP encoder 
in contrast to stochastic codebook search processing in 
the case where an algebraic codebook is applied to a 
CELPencoder. First, the codebook search processing 
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when an algebraic codebook is used for the stochastic 
codebook section will be explained. 

Suppose the number of non-zero elements in a vector 
output by the algebraic codebook is N (the number of 
5 channels of the algebraic codebook is N), a vector 
including only one non-zero element whose amplitude 
output per channel is +1 or -1 (the amplitude of elements 
other than non-zero elements is zero) is di (i: channel 

number: O^i^N-1) and the subframe length is L * 
10 Stochastic excitation vector ck with entry number k 
output by the algebraic codebook is expressed in 
expression 9 below: 

iV-l 

i=0 

Expression 9 

15 where: 

Ck: Stochastic excitation vector with entry number 
K according to algebraic codebook 

di: Non-zero element vector ( di = ±d(n - pi) , where pi: 
position of non-zero element) 
20 N: The number of channels of algebraic codebook (= 

The number of non-zero elements in stochastic excitation 
vector ) 

Then, by substituting expression 9 into expression 
10, expression 11 below is obtained: 

P* c A 

Expression 10 
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where : 

v t : Transposition vector of v (target vector for 
stochastic codebook search ) 

H t : Transposition matrix of H (impulse response 
5 matrix of the synthesis filter) 

ck: Stochastic excitation vector of entry number 

k 




N-l 

HQ**) 

Expression 11 

10 where: 

v: target vector for stochastic codebook search 
H: Impulse response convolution matrix of the 
synthesis filter 

di: Non-zero element vector ( di = ±6(n - pi) , where pi : 
15 position of non-zero element) 

N: The number of channels of algebraic codebook 
(= The number of non-zero elements in stochastic 
excitation vector ) 

x =v H 
20 M^H* H 

The processing to identify entry number k that 
maximizes expression 12 below obtained by arranging this 
expression 10 becomes stochastic codebook search 
25 processing. 
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Expression 12 

where, x t =v t H N M=H t H (v is atarget vector for 
stochastic codebook search) in expression 1 2 * Here , when 
5 the value of expression 12 about each entry number k is 
calculated, x t = v t H and M=H t H are calculated in the 
pre-processing stage and the calculation result is 
developed (stored) in memory. It is disclosed in the 
above papers, etc. and generally known that introducing 

10 this pre-processing makes it possible to drastically 
reduce the amount of computational complexity when 
expression 12 is calculated for every candidate entered 
as the stochastic excitation vector and as a result, 
suppress the total amount of computational complexity 

15 required for a stochastic codebook search to a small 
value . 

Next, the stochastic codebook search processing 
when the dispersed-pulse codebook is used for the 
stochastic codebook will be explained. 

20 Suppose the number of non-zero elements output from 

the algebraic codebook, which is a component of the 
dispersed-pulse codebook, is N (N: the number of channels 
of the algebraic codebook), a vector that includes only 
one non-zero element whose amplitude is +1 or -1 output 

25 for each channel (the amplitude of elements other than 
non-zero element is zero) is di (i: channel number: 0 



49 



^i^N-l), the dispersion patterns for channel number i 
stored in the dispersion pattern storage section is wi 
and the subframe length is L. Then, stochastic 
excitation vector ck of entry number k output from the 
dispersed-pulse codebook is given by expression 13 



where : 

Ck: Stochastic excitation vector of entry number 
k output from dispersed-pulse codebook 

Wi: dispersion pattern (wi) convolution matrix 
di: Non-zero element vector output by algebraic 

codebook section ( C^— ± 8 (n-pi), where Pi : position of 
non-zero element) 

N: The number of channels of algebraic codebook 
section 

Therefore, in this case, expression 14 below is 
obtained by substituting expression 13 into expression 
10 • 



below: 



N-l 



Ck= XWidi 



Expression 13 




Expression 14 



where : 
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V: target vector for stochastic codebook search 
H: Impulse response convolution matrix of 

synthesis filter 

Wi: Dispersion pattern (wi) convolution matrix 
di: Non-zero element vector output by typical 

codebook section 

(di= ±5(n-Pi) 9 where Pi : position of non-zero element) 
N: The number of channels of algebraic codebook (= 

the number of non-zero elements in stochastic excitation 

vector ) 



The processing of identifying entry number k of the 
stochastic excitation vector that maximizes expression 
15 below obtained by arranging this expression 14 is the 
stochastic codebook search processing when the 
dispersed-pulse codebook is used. 



where, in expression 15, x t =v t Hi (where Hl=HWi : 
Wi is the dispersion pattern convolution matrix) . When 
a value of expression 15 is calculated for each entry 
number k, it is possible to calculate Hi=HWi , x t =v t Hl and 
R=Hi t Hj as the pre-processing and record this in memory. 



Hi = HWi 



x^Hi 



R= HiHj 




Expression 15 
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calculate expression 15 for each candidate entered as 
a stochastic excitation vector becomes equal to the 
amount of computational complexity to calculate 
expression 12 when the algebraic codebook is used (it 
5 is obvious that expression 12 and expression 15 have the 
same form) and it is possible to perform a stochastic 
codebook search with a small amount of computational 
complexity even when the disper s ed-pulse codebook is 
used . 

10 The above technology shows the effects of using the 

dispersed-pulse codebook for the stochastic codebook 
section of the CELP encoder /decoder and shows that when 
used for the stochastic codebook section, the 
dispersed-pulse codebook makes it possible to perform 

15 a stochastic codebook search with the same method as that 
when the algebraic codebook is used for the stochastic 
codebook section. The difference between the amount of 
computational complexity required for a stochastic 
codebook search when the algebraic codebook is used for 

20 the stochastic codebook section and the amount of 
computational complexity required for a stochastic 
codebook search when the dispersed-pulse codebook is 
used for the stochastic codebook section corresponds to 
the difference between the amounts of computational 

25 complexity required for the pre-processing stage of 

expression 12 and expression 15, that is, the difference 
between the amounts of computational complexity 
required for pre-processing (x t = v t Hi» M=H t ff) and pre- 
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processing (Hi=HWi^ x t =v t Hi^ R-Hi t Hj). 

In general, with the CELPencoder/decoder, as the 
bit rate decreases, the number of bits assignable to the 
stochastic codebook section also tends tobe decreased. 
5 This tendency leads to a decrease in the number of 

non-zero elements when a stochastic excitation vector 
is formed in the case where the algebraic codebook and 
dispersed-pulse codebook are used for the stochastic 
codebook section. Therefore, as the bit rate of the CELP 

10 encoder /decoder decreases, the difference in the amount 
of computational complexity when the algebraic codebook 
is used and when the dispersed-pulse codebook is used 
decreases . However, when the bit rate is relatively high 
or when the amount of computational complexity needs to 

15 be reduced even if the bit rate is low, the increase in 
the amount of computational complexity in the pre- 
processing stage resulting from using the dispersed- 
pulse codebook is not negligible. 

This embodiment explains the case where in a 

20 CELP-based speech encoder and speech decoder and speech 
encoding/decoding system using a dispersed-pulse 
codebook for the stochastic codebook section, the 
decoding side obtains synthesized speech of high quality 
while suppressing to a low level the increase in the 

25 amount of computational complexity of the pre-processing 
section in the stochastic codebook search processing, 
which increases compared with the case where the 
algebraic codebook is used for the stochastic codebook 



53 



section. 

More specif ically, the technology according to 
this embodiment is intended to solve the problem above 
that may occur when the dispersed-pulse codebook is used 
5 for the stochastic codebook section of the 

CELPencoder /decoder , and is characterized by using 
adispersion pattern, which differs between the encoder 
and decoder. That is, this embodiment registers the 
above-described dispersion pattern in the dispersion 
10 pattern storage section on the speech decoder side and 
generates synthesized speech of higher quality using the 
dispersion pattern than using the algebraic codebook. 
On the other hand, the speech encoder registers a 
dispersion pattern, which is the simplified dispersion 
15 pattern to be registered in the dispersion pattern 

storage section of the decoder (e.g. , dispersion pattern 
selected at certain intervals or dispersion pattern 
truncated at a certain length) and performs a stochastic 
codebook search using the simplified dispersion pattern. 
20 When the dispersed-pulse codebook is used for the 

stochastic codebook section, this allows the coding side 
to suppress to a small level the amount of computational 
complexity at the time of a stochastic codebook search 
in the pre-processing stage, which increases compared 
25 to the case where the algebraic codebook is used for the 
stochastic codebook section and allows the decoding side 
to obtain a synthesized speech of high quality. 

Using different dispersion patterns for the 
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encoder and decoder means acquiring an dispersion 
pattern for the encoder by modifying the prepared 
spreading vector (for the decoder) while reserving the 
characteristic . 
5 Here, examples of the method for preparing a 

dispersion pattern for the decoder include the methods 
disclosed in the patent (Unexamined Japanese Patent 
Publication No.HEI 10-63300) applied for by the present 
inventor, et al., that is, a method for preparing a 

10 dispersion pattern by training of the statistic tendency 
of a huge number of target vectors for stochastic codebook 
search, a method for preparing a dispersion vector by 
repeating operations of encoding and decoding the actual 
target vector for stochastic codebook search and 

15 gradually modifying the decoded target vector in the 
direction in which the sum total of coding distortion 
generated is reduced, a method of designing based on 
phonological knowledge in order to achieve synthesized 
speech of high quality or a method of designing for the 

20 purpose of randomizing the high frequency phase 

component of the pulse excitation vector. All these 
contents are included here. 

All these dispersion patterns acquired in this way 
are characterized in that the amplitude of a sample close 

25 to the start sample of the dispersion pattern (forward 
sample) is relatively larger than the amplitude of a 
backward sample. Above all, the amplitude of the start 
sample is often the maximum of all samples in the 
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dispersion pattern (this is true in most cases). 

The following are examples of the specific method 
for acquiring a dispersion pattern for the encoder by 
modifying the dispersion pattern for the decoder while 
5 reserving the characteristic: 

1) Acquiring a dispersion pattern for the encoder 
by replacing the sample value of the dispersion pattern 
for the decoder with zero at appropriate intervals 

2) Acquiring a dispersion pattern for the encoder 
10 by truncating the dispersion pattern for the decoder of 

a certain length at an appropriate length 

3) Acquiring a dispersion pattern for the encoder 
by setting a threshold of amplitude beforehand and 
replacing a sample whose amplitude is smaller than a 

15 threshold set for the dispersion pattern for the decoder 
with zero 

4) Acquiring a dispersion pattern for the coder by 
storing a sample value of the dispersion pattern for the 
decoder of a certain length at appropriate intervals 

20 including the start sample and replacing other sample 
values with zero 

Here, even in the case where a few samples from the 
beginning of the dispersion pattern is used as in the 
case of the method in 1 ) above, for example, it is possible 
25 to acquire a new dispersion pattern for the encoder while 
reserving an outline (gross characteristic) of the 
dispers ion pattern . 

Furthermore, even in the case where a sample value 
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is replaced with zero at appropriate intervals as in the 
case of the method in 2 ) above, for example, it is possible 
to acquire a new dispersion pattern for the encoder while 
reserving an outline (gross characteristic) of the 
5 original dispersion pattern. Especially, the method in 
4) above includes a restriction that the amplitude of 
the start sample whose amplitude is often the largest 
should always be saved as is, and therefore it is possible 
to save an outline of the original spreading vector more 

1 0 reliably . 

Furthermore, even in the case where a sample whose 
amplitude is equal to or larger than a specific threshold 
value is saved as is and a sample whose amplitude is 
smaller than the specific threshold value is replaced 

15 with zero as the method in the case of 3) above, it is 
possible to acquire a dispersion pattern for the encoder 
while reserving an outline (gross characteristic) of the 
d is per s ion pattern . 

The speech encoder and speech decoder according to 

20 this embodiment will be explained in detail with 

reference to the attached drawings below. The CELP 
speech encoder (FIG. 11) and the CELP speech decoder 
(FIG. 12) described in the attached drawings are 
characterized by using the above dispersed-pulse 

25 codebook for the stochastic codebook section of the 
conventional CELP speech encoder and the CELP speech 
decoder. Therefore, in the following explanations, it 
is possible to read the parts described "the stochastic 



57 



codebook", "stochastic excitation vector" and 
"stochastic excitation vector gain" as "dispersed-pulse 
codebook", "dispersed-pulse excitation vector" and 
"dispersed-pulse excitation vector gain" , respectively . 
The stochastic codebook in the CELP speech encoder and 
the CELP speech decoder has the function of storing a 
noise codebook or fixed waveforms of a plurality of types, 
and therefore is sometimes also called a "fixed 
codebook" . 

In the CELP speech encoder in FIG. 11, linear 
predictive analysis section 501 performs a linear 
predictive analysis on the input speech and calculates 
a linear prediction coefficient first and then outputs 
the calculated linear prediction coefficient to linear 
prediction coefficient encoding section 502. Then, 
linear prediction coefficient encoding section 502 
performs encoding (vector quantization) on the linear 
prediction coefficient and outputs the quantization 
index (hereinafter referred to as "linear predictive 
code") obtained by vector quantization to code output 
section 513 and linear predictive code decoding section 
503 . 

Then, linear predictive code decoding section 503 
performs decoding (inverse-quantization) on the linear 
predictive code obtained by linear prediction 
coefficient encoding section 502 and outputs to 
synthesis filter 504. Synthesis filter 504 constitutes 
a synthesis filter having the all-pole model structure 
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based on the decoding linear predictive code obtained 
from linear predictive code decoding section 503. 

Then, vector adder 511 adds up a vector obtained 
by multiplying the adaptive excitation vector selected 
5 from adaptive codebook 506 by adaptive excitation vector 
gain 509 and a vector obtained by multiplying the 
stochastic excitation vector selected from 
dispersed-pulse codebook 507 by stochastic excitation 
vector gain 510 to generate an excitation vector. Then, 

10 distortion calculation section 505 calculates 

distortion between the output vector when synthesis 
filter 504 is excited by the excitation vector and the 
input speech according to expression 16 below and outputs 
distortion ER to code identification section 512. 

15 ER=\\u-(g a Hp + g c Hc)( 

Expression 16 

where : 

u: Input speech (vector) 

H: Impulse response matrix of synthesis filter 
20 p: Adaptive excitation vector 

C: Stochastic excitation vector 
g a : Adaptive excitation vector gain 
g c : Stochastic excitation vector gain 
In expression 16, u denotes an input speech vector 
25 inside the frame being processed, H denotes an impulse 
response matrix of synthesis filter, ga denotes an 
adaptive excitation vector gain, gc denotes a stochastic 
excitation vector gain, p denotes an adaptive excitation 
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vector and c denotes a stochastic excitation vector* 

Here, adaptive codebook 506 is a buffer (dynamic 
memory) that stores excitation vectors corresponding a 
several number of past frames and the adaptive excitation 
vector selected from adaptive codebook 506 above is used 
to express the periodic component in the linear 
predictive residual vector obtained by passing the input 
speech through the inverse-filter of the synthesis 
filter . 

On the other hand, the excitation vector selected 
from dispersed-pulse codebook 507 is used to express the 
non-periodic (the component obtained by removing 
periodic component (adaptive excitation vector 
component) from the linear predictive residual vector) 
newly added to the linear predictive residual vector in 
the frame actually being processed. 

Adaptive excitation vector gain multiplication 
section 509 and stochastic excitation vector gain 
multiplication section 510 have the function of 
multiplying the adaptive excitation vector selected from 
adaptive codebook 506 and stochastic excitation vector 
selected from dispersed-pulse codebook 507 by the 
adaptive excitation vector gain and stochastic 
excitation vector gain read from gain codebook 508 . Gain 
codebook 508 is a static memory that stores a plurality 
of types of sets of an adaptive excitation vector gain 
to be multiplied on the adaptive excitation vector and 
stochastic excitation vector gain to be multiplied on 
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the stochastic excitation vector. 

Code identification section 512 selects an optimal 
combination of indices of the three codebooks above 
(adaptive codebook, dispersed-pulse codebook, gain 
5 codebook) that minimizes distortion ER of expression 16 
calculated by distortion calculation section 505 . Then, 
distortion identification section 512 outputs the 
indices of their respective codebooks selected when the 
above distortion reaches a minimum to code output section 

10 513 as adaptive excitation vector code, stochastic 
excitation vector code and gain code, respectively. 

Finally, code output section 513 compiles the 
linear predictive code obtained from linear prediction 
coefficient encoding section 502 and the adaptive 

15 excitation vector code, stochastic excitation vector 
code and gain code identified by code identification 
section 512 into a code (bit information) that expresses 
the input speech inside the frame actually being 
processed and outputs this code to the decoder side. 

20 By the way, code identification section 512 

sometimes identifies an adaptive excitation vector code, 
stochastic excitation vector code and gain code on a 
"subframe" basis, where "subframe" is a subdivision of 
the processing frame. However, no distinction will be 

25 made between a frame and a subframe (will be commonly 
referred to as "frame") in the following explanations 
of the present description. 

Then, an outline of the CELP speech decoder will 
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be explained using FIG. 12. 

In the CELP decoder in FIG. 12, code input section 
601 receives a code (bit information to reconstruct a 
speech signal on a (sub) frame basis) identified and 
5 transmitted from the CELP speech encoder (FIG. 11) and 
de-multiplexes the received code into 4 types of code: 
a linear predictive code, adaptive excitation vector 
code, stochastic excitation vector code and gain code. 
Then, code input section 601 outputs the linear 

10 predictive code to linear prediction coefficient 

decoding section 602, the adaptive excitation vector 
code to adaptive codebook 603 , the stochastic excitation 
vector code to disper sed-pulse codebook 604 and the gain 
code to gain codebook 605. 

15 Then, linear prediction coefficient decoding 

section 602 decodes the linear predictive code input from 
code input section 601, obtains a decoded linear 
predictive coefficients and outputs this decoded linear 
predictive coefficients to synthesis filter 609. 

20 Synthesis filter 609 constructs a synthesis filter 

having the all-pole model structure based on the decoding 
linear predictive code obtained from linear predictive 
code decoding section 602. On the other hand, adaptive 
codebook 603 outputs an adaptive excitation vector 

25 corresponding to the adaptive excitation vector code 
input from code input section 601. Dispersed-pulse 
codebook 604 outputs a stochastic excitation vector 
corresponding to the stochastic excitation vector code 
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input from code input section 601. Gain codebook 605 
reads an adaptive excitation gain and stochastic 
excitation gain corresponding to the gain code input from 
code input section 601 and outputs these gains to 
5 adaptive excitation vector gain multiplication section 
606 and stochastic excitation vector gain mult iplication 
section 607, respectively. 

Then, adaptive excitation vector gain 
multiplication section 606 multiplies the adaptive 

10 excitation vector output from adaptive codebook 603 by 
the adaptive excitation vector gain output from gain 
codebook 605 and stochastic excitation vector gain 
multiplication section 607 multiplies the stochastic 
excitation vector output from dispersed-pulse codebook 

15 604 by the stochastic excitation vector gain output from 
gain codebook 605. Then, vector addition section 608 
adds up the respective output vectors of adaptive 
excitation vector gain multiplication section 606 and 
stochastic excitation vector gain multiplication 

20 section 607 to generate an excitation vector. Then, 
synthesis filter 609 is excited by this excitation vector 
and a synthesized speech of the received frame section 
is output. 

It is important to suppress distortion ER of 
25 expression 16 to a small value in order to obtain a 

synthesized speech of high quality in such a CELP-based 
speech encoder /speech decoder. To do this, it is 
desirable to identify the best combination of an adaptive 
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excitation vector code, stochastic excitation vector 
code and gain code in closed-loop fashion so that ER of 
expression 16 is minimized. However, since attempting 
to identify distortion ER of expression 16 in the 
5 closed-loop fashion leads to an excessively large amount 
of computational complexity, it is a general practice 
to identify the above 3 types of code in the open-loop 
fashion . 

More specifically, an adaptive codebook search is 

10 performed first* Here, the adaptive codebook search 
processing refers to processing of vector quantization 
of the periodic component in a predictive residual vector 
obtained by passing the input speech through the 
inverse-filter by the adaptive excitation vector output 

15 from the adaptive codebook that stores excitation 

vectors of the past several frames. Then, the adaptive 
codebook search processing identifies the entry number 
of the adaptive excitation vector having a periodic 
component close to the periodic component within the 

20 linear predictive residual vector as the adaptive 

excitation vector code. At the same time, the adaptive 
codebook search temporarily ascertains an ideal adaptive 
excitation vector gain. 

Then, a stochastic codebook search (corresponding 

25 to dispersed-pulse codebook search in this embodiment) 
is performed. The dispersed-pulse codebook search 
refers to processing of vector quantization of the linear 
predictive residual vector of the frame being processed 
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with the periodic component removed, that is, the 
component obtained by subtracting the adaptive 
excitation vector component from the linear predictive 
residual vector (hereinafter also referred to as "target 
5 vector for stochastic codebook search") using a 

plurality of stochastic excitation vector candidates 
generated from the dispersed-pulse codebook. Then, 
this dispersed-pulse codebook search processing 
identifies the entry number of the stochastic excitation 

10 vector that performs encoding of the target vector for 
stochastic codebook search with least distortion as the 
stochastic excitation vector code. At the same time, the 
dispersed-pulse codebook search temporarily ascertains 
an ideal stochastic excitation vector gain. 

15 Finally, a gain codebook search is performed. The 

gain codebook search is processing of encoding (vector 
quantization) on a vector made up of 2 elements of the 
ideal adaptive gain temporarily obtained during the 
adaptive codebook search and the ideal stochastic gain 

20 temporarily obtained during the dispersed-pulse 

codebook search so that distortion with respect to a gain 
candidate vector (vector candidate made up of 2 elements 
of the adaptive excitation vector gain candidate and 
stochastic excitation vector gain candidate) stored in 

25 the gain codebook reaches a minimum. Then, the entry 
number of the gain candidate vector selected here is 
output to the code output section as the gain code. 

Here, of the general code search processing above 
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in the CELP speech encoder, the dispersed-pulse codebook 
search processing (processing of identifying a 
stochastic excitation vector code after identifying an 
adaptive excitation vector code) will be explained in 
5 further detail below. 

As explained above, a linear predictive code and 
adaptive excitation vector code are already identified 
when a dispersed-pulse codebook search is performed in 
a general CELP encoder. Here, suppose an impulse 

10 response matrix of a synthesis filter made up of an 
already identified linear predictive code is H, an 
adaptive excitation vector corresponding to an adaptive 
excitation vector code is p and an ideal adaptive 
excitation vector gain (provisional value) determined 

15 simultaneously with the identification of the adaptive 
excitation vector code is ga . Then, distortion ER of 
expression 16 is modified into expression 17 below. 
ER k =\v~g c Hc k f 

Expression 17 

20 where: 

v: Target vector for stochastic codebook search 
(where, v = u-g a Hp ) 

g c : Stochastic excitation vector gain 
H: Impulse response matrix of a synthesis filter 
25 C k : Stochastic excitation vector (k : entry number) 

Here, vector v in expression 17 is the target vector 
for stochastic codebook search of expression 18 below 
using input speech signal u in the processing frame, 
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impulse response matrix H (determined) of the synthesis 
filter, adaptive excitation vector p (determined) and 
ideal adaptive excitation vector gain ga (provisional 
value) . 
5 v = u - g a Hp 

Expression 18 

where : 

u: Input speech (vector) 

g a : Adaptive excitation vector gain (provisional 
10 value) 

H: Impulse response matrix of a synthesis filter 
p: Stochastic excitation vector 

By the way, the stochastic excitation vector is 
15 expressed as "c" in expression 16, while the stochastic 
excitation vector is expressed as "ck" in expression 17. 
This is because expression 16 does not explicitly 
indicate the difference of the entry number (k) of the 
stochastic excitation vector, whereas expression 17 
20 explicitly indicates the entry number. Despite the 

difference in expression, both are the same in meaning. 

Therefore, the dispersed-pulse codebook search 
means the processing of determining entry number k of 
stochastic excitation vector ck that minimizes 
25 distortion ERk of expression 17. Moreover, when entry 
number k of stochastic excitation vector ck that 
minimizes distortion ERk of expression 17 is identified, 
stochastic excitation gain gc is assumed to be able to 
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take an arbitrary value. Therefore, the processing of 
determining the entry number that minimizes distortion 
of expression 17 can be replaced with the processing of 
identifying entry number k of stochastic excitation 
5 vector ck that maximizes Dk of expression 10 above. 
Then, the dispersed-pulse codebook search is 
carried out in 2 stages: distortion calculation section 
505 calculates Dk of expression 10 for every entry number 
k of stochastic excitation vector ck, outputs the value 

10 to code identification section 512 and code 

identification section 512 compares the values, large 
and small, in expression 10 for every entry number k, 
determines entry number k when the value reaches a 
maximum as the stochastic excitation vector code and 

15 outputs to code output section 513. 

The operations of the speech encoder and speech 
decoder according to this embodiment will be explained 
below. 

FIG.13A shows a configuration of dispersed-pulse 
20 codebook 507 in the speech encoder shown in FIG. 11 and 
FIG.13B shows a configuration of dispersed-pulse 
codebook 604 in the speech decoder shown in FIG. 12. The 
difference in configuration between dispersed-pulse 
codebook 507 shown in FIG.13A and dispersed-pulse 
25 codebook 604 shown in FIG.13B is the difference in the 
shape of dispersion patterns registered in the 
dispersion pattern storage section. 

In the case of the speech decoder in FIG.13B, 
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dispersion pattern storage section 4012 registers one 
type per channel of any one of (1) dispersion pattern 
of a shape resulting from statistical training of shapes 
of a huge number of target vectors for stochastic 
5 codebook search, contained in a target vector for 

stochastic codebook search, (2) dispersion pattern of 
a random-like shape to efficiently express unvoiced 
consonant segments and noise-like segments, (3) 
dispersion pattern of a pulse-like shape to efficiently 

10 express stationary voiced segments, (4) dispersion 

pattern of a shape that gives an effect of spreading 
around the energy (the energy is concentrated on the 
positions of non-zero elements) of an excitation vector 
output from the algebraic codebook, (5) dispersion 

15 pattern selected from among several arbitrarily prepared 
dispersion pattern candidates by repeating encoding and 
decoding of the speech signal and an subjective 
(listening) evaluation of the synthesized speech so that 
synthesized speech of high quality can be output and (6) 

20 dispersion pattern created based on phonological 
knowledge . 

On the other hand, dispersion pattern storage 
section 4012 in the speech encoder in FIG.13A registers 
dispersion patterns obtained by replacing dispersion 
25 patterns registered in dispersion pattern storage 

section 4012 in the speech decoder in FIG.13B with zero 
for every other sample. 

Then, the CELP speech encoder /speech decoder in the 
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above configuration encodes /decodes the speech signal 
using the same method as described above without being 
aware that different dispersion patterns are registered 
in the encoder and decoder. 
5 The encoder can reduce the amount of computational 

complexity of pre-processing during a stochastic 
codebook search when the dispersed-pulse codebook is 
used for the stochastic codebook section (can reduce by 
half the amount of computational complexity of H i =H t W i 

10 and X it =v t Hi), while the decoder can spread around the 
energy concentrated on the positions of non-zero 
elements by convoluted conventional dispersion patterns 
on pulse vectors, making it possible to improve the 
quality of a synthesized speech. 

15 As shown in FIG.13A and FIG.13B, this embodiment 

describes the case where the speech encoder uses 
dispersion patterns obtained by replacing dispersion 
patterns used by the speech decoder with zero every other 
sample. However, this embodiment is also directly 

20 applicable to a case where the speech encoder uses 

dispersion patterns obtained by replacing dispersion 
pattern elements used by the speech decoder with zero 
every N (N^l) samples, and it is possible to attain 
similar action in that case, too. 

25 Furthermore, this embodiment describes the case 

where the dispersion pattern storage section registers 
dispersion patterns of one type per channel, but the 
present invention is also applicable to a CELP speech 
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encoder /decoder that uses the dispersed-pulse codebook 
characterized by registering dispersion patterns of 2 
or more types per channel and selecting and using a 
dispersion pattern for the stochastic codebook section, 
5 and it is possible to attain similar actions and effects 
in that case, too. 

Furthermore, this embodiment describes the case 
where the dispersed-pulse codebook use an algebraic 
codebook that outputs a vector including 3 non-zero 
10 elements, but this embodiment is also applicable to a 
case where the vector output by the algebraic codebook 

section includes M (M^l) non-zero elements, and it is 
possible to attain similar actions and effects in that 
case, too. 

15 Furthermore, this embodiment describes the case 

where an algebraic codebook is used as the codebook for 
generating a pulse vector made up of a small number of 
non-zero elements, but this embodiment is also 
applicable to a case where other codebooks such as 

20 multi-pulse codebook or regular pulse codebook are used 
as the codebooks for generating the relevant pulse vector, 
and it is possible to attain similar actions and effects 
in that case, too. 

Then, FIG.14A shows a configuration of the 

25 dispersed-pulse codebook in the speech encoder in FIG. 11 
and FIG.14B shows a configuration of the dispersed-pulse 
codebook in the speech decoder in FIG. 12. 

The difference in configuration between the 
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dispersed-pulse codebook shown in FIG.14A and the 
dispersed-pulse codebook shown in FIG.14B is the 
difference in the length of dispersion patterns 
registered in the dispersion pattern storage section. 
5 In the case of the speech decoder in FIG.14B, dispersion 
pattern storage section 4012 registers one type per 
channel of any one of (1) dispersion pattern of a shape 
resulting from statistical training of shapes based on 
a huge number of target vectors for stochastic codebook 

10 search, (2) dispersion pattern of a random-like shape 
to efficiently express unvoiced consonant segments and 
noise-like segments, (3) dispersion pattern of a 
pulse-like shape to efficiently express stationary 
voiced segments, (4) dispersion pattern of a shape that 

15 gives an effect of spreading around the energy (the 
energy is concentrated on the positions of non-zero 
elements) of an excitation vector output from the 
algebraic codebook, ( 5 ) dispersion pattern selected from 
among several arbitrarily prepared dispersion pattern 

20 candidates by repeating encoding and decoding of the 
speech signal and sub ject ive ( lis tening ) evaluation of 
the synthesized speech so that synthesized speech of high 
quality can be output and (6) dispersion pattern created 
based on phonological knowledge. 

25 On the other hand, dispersion pattern storage 

section 4012 in the speech encoder in FIG.14A registers 
dispersion patterns obtained by truncating dispersion 
patterns registered in the dispersion pattern storage 
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section in the speech decoder in FIG. 14B at a half length. 
Then, the CELP speech encoder /speech decoder in the 
above configurations encodes /decodes the speech signal 
using the same method as described above without being 
5 aware that different dispersion patterns are registered 
in the encoder and decoder. 

The coder can reduce the amount of computational 
complexity of pre-processing during a stochastic 
codebook search when the dispersed-pulse codebook is 

10 used for the stochastic codebook section (can reduce by 
half the amount of computational complexities of H i =H t W i 
and X it =v t H i ) / while the decoder uses the same 
conventional dispersion patterns, making it possible to 
improve the quality of a synthesized speech. 

15 As shown in FIG.14A and FIG. 14B, this embodiment 

describes the case where the speech encoder uses 
dispersion patterns obtained by truncating dispersion 
patterns used by the speech decoder at a half length. 
However, when dispersion patterns used by the speech 

20 decoder are truncated at a shorter length N (N^ 1 ) , this 
embodiment provides an effect that it is possible to 
further reduce the amount of computational complexty of 
pre-processing during a stochastic codebook search. 
However, the case where dispersion patterns used by the 

25 speech encoder are truncated at a length of 1 corresponds 
to the speech encoder that uses no dispersion pattern 
(dispersion patterns are applied to the speech decoder) . 
Furthermore, this embodiment describes the case 
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where the dispersion pattern storage section registers 
dispersion patterns of one type per channel, but the 
present invention is also applicable to a speech 
encoder /decoder that uses the dispersed-pulse codebook 
5 characterized by registering dispersion patterns of 2 
or more types per channel and selecting and using a 
dispersion pattern for the stochastic codebook section, 
and it is possible to attain similar actions and effects 
in that case, too. 

10 Furthermore, this embodiment describes the case 

where the dispersed-pulse codebook uses an algebraic 
codebook that outputs a vector including 3 non-zero 
elements, but this embodiment is also applicable to a 
case where the vector output by the algebraic codebook 

15 section includes M (M^l) non-zero elements, and it is 
possible to attain similar actions and effects in that 
case, too. 

Furthermore, this embodiment describes the case 
where the speech encoder uses dispersion patterns 

20 obtained by truncating the dispersion patterns used by 
the speech decoder at a half length, but it is also 
possible for the speech encoder to truncate the 
dispersion patterns used by the speech decoder at a 
length of N (N^l) and further replace the truncated 

25 dispersion patterns with zero every M (M^ 1 ) samples , and 
it is possible to further reduce the amount of 
computational complexity for the stochastic codebook 
search . 
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Thus, according to this embodiment , the CELP-based 
speech encoder, decoder or speech encoding/decoding 
system using the dispersed-pulse codebook for the 
stochastic codebook section registers fixed waveforms 
5 frequently included in target vectors for stochastic 
codebook search acquired by statistical training 
asdispersion vectors, convolutes (reflects) these 
dispersion patterns on pulse vectors, and can thereby 
use stochastic excitation vectors, which is closer tothe 

10 actual target vectors for stochastic codebook search, 
providing advantageous effects such as allowing the 
decoding side to improve the quality of synthesized 
speech while allowing the encoding side to suppress the 
amount of computational complexity for the stochastic 

15 codebook search, which is sometimes problematic when the 
dispersed-pulse codebook is used for the stochastic 
codebook section, to a lower level than conventional 
arts . 

This embodiment can also attain similar actions and 
20 effects in the case where other codebooks such as 

multi-pulse codebook or regular pulse codebook, etc. are 
used as the codebooks for generating pulse vectors made 
up of a small number of non-zero elements. 

The speech encoding/decoding according to 
25 Embodiments 1 to 3 above are described as the speech 
encoder /speech decoder, but this speech 

encoding/decoding can also be implemented by software. 
For example, it is also possible to store a program of 
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speech encoding/decoding described above in ROM and 
implement encoding/decoding under the instructions from 
a CPU according to the program. It is further possible 
to store the program, adaptive codebook and stochastic 
5 codebook ( dispersed-pulse codebook) in a computer- 
readable recording medium, record the program, adaptive 
codebook and stochastic codebook (dispersed-pulse 
codebook) of this recording medium in RAM of the computer 
and implement encoding/decoding according to the program. 

10 In this case, it is also possible to attain similar 
actions and effects to those in Embodiments 1 to 3 above. 
Moreover, it is also possible to download the program 
in Embodiments 1 to 3 above through a communication 
terminal and allow this communication terminal to run 

15 the program. 

Embodiments 1 to 3 can be implemented individually 
or combined with one another. 

This application is based on the Japanese Patent 
Application No.HEI 11-235050 filed on August 23, 1999, 

20 the Japanese Patent Application No.HEI 11-236728 filed 
on August 24, 1999 and the Japanese Patent Application 
No.HEI 11-248363 filed on September 2, 1999, entire 
content of which is expressly incorporated by reference 
herein . 

25 

Industrial Applicability 

The present invention is applicable to a base 
station apparatus or communication terminal apparatus 
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in a digital communication system, 



Amended Claims 

1. A speech encoder comprising: 

LPC synthesizing means for obtaining a synthesized 
speech by filtering adaptive excitation vector and 
stochastic excitation vector stored in an adaptive 
codebook and stochastic codebook using an LPC 
coefficients obtained from an input speech; 

gain calculating means for calculating gains of 
said adaptive excitation vector and said stochastic 
excitation vector and searching code of the adaptive 
excitation vector and searching code of the stochastic 
excitation vector using coding distortion between said 
input speech and said synthesized speech obtained using 
said gains; and 

parameter coding means for performing predictive 
coding of gains using the adaptive excitation vector and 
stochastic excitation vector corresponding to the codes 
obtained, wherein said parameter coding means comprises 
prediction coefficient adjusting means for adjusting 
one or more prediction coefficients used for said 
predictive coding according to one or more states of a 
previous sub frame . 

2. The speech encoder according to claim l f wherein 
when one ore more states of aprevious subframes are an 
extremely large value or an extremely small value, said 
prediction coefficient adjusting means adjusts said 
prediction coefficients so as to reduce the influence 
thereof . 
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3. The speech encoder according to claim l f wherein 
said parameter coding means has a codebook including gain 
vectors of the adaptive excitation vectors, logarithmic 
gain vectors of the stochastic excitation vectors and 

5 coefficients for adjusting the prediction coefficient, 

4. The speech encoder according to claim 3, wherein 
in predicting coding when a product sum between states 
and prediction coefficients are calculated, prediction 
coefficient adjustment coefficients corresponding to 

10 the states are multiplied. 

5. The speech encoder according to claim 1, further 
comprising storing means for storing said adaptive 
excitation vector, said stochastic excitation vector and 
prediction coefficient adjustment coefficient in 

15 accordance with each state. 

6. (Amended) The speech encoder according to claim 
5, wherein when said adaptive excitation vector and said 
stochastic excitation vector stored in said storing 
means are updated, said prediction coefficient 

20 adjustment coefficients are also updated. 

7. A CELP-based speech encoder that performs 
encoding by decomposing one frame into a plurality of 
sub frames , compris ing : 

LPC synthesizing means for obtaining a synthesized 
25 speech by filtering adaptive excitation vector and 
stochastic excitation vector stored in an adaptive 
codebook and stochastic codebook using LPC coeff icienta 
obtained from an input speech; 
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gain calculating means for calculating gains of 
said adaptive excitation vector and said stochastic 
excitation vector; and 

parameter coding means for performing vector 
5 quantization of the adaptive excitation vector and 
stochastic excitation vector obtained using coding 
distortion between said input speech and said 
synthesized speech and said gains, and further 
comprising: 

10 pitch analyzing means for performing pitch 

analyses of a plurality of subframes in the frame 
respectively, before performing an adaptive codebook 
search for the first subframe, finding a correlation 
value and calculating a value most approximate to the 

15 pitch period using said correlation values. 

8. The speech encoder according to claim 7, further 
comprising search range setting means for determining 
a lag search range of a plurality of subframes based on 
the correlation value and the value most approximate to 

20 the pitch period obtained by said pitch analyzing means. 

9. The speech encoder according to claim 8, wherein 
said search range setting means determines a provisional 
pitch that becomes the center of the search range using 
the correlation values and the values most approximate 

25 to the pitch period obtained by said pitch analyzing 
means . 

10 . The speech encoder according to claim 9 , wherein 
the search range setting means sets a lag search section 
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in a specified range around the provisional pitch. 

11. The speech encoder according to claim 8, wherein 
the search range setting means sets a lag search section 
by reducing the number of candidates for shortpitch 

5 periods . 

12. The speech encoder according to claim 8, wherein 
the search range setting means performs a lag search 
within a set range during an adaptive codebook search. 

13. A computer-readable recording media storing a 
10 speech encoding program, an adaptive codebook storing 

past used for sy nthes is ing excitation vector signals and 
a stochastic codebook storing a plurality of stochastic 
excitation vectors, said speech encoding program 
comprising the steps of: 

15 obtaining a synthesized speech by filtering 

adaptive excitation vector and by filtering stochastic 
excitation vector stored in said adaptive codebook and 
in said stochastic codebook using decoded LPC 
coefficients obtained from an input speech; 

20 calculating gains of said adaptive excitation 

vector and said stochastic excitation vector; 

performing vector quantization on the adaptive 
excitation vector and stochastic excitation vector 
determined using coding distortion between said input 

25 speech and said synthesized speech, and said gains, 

wherein said vector quantization step further comprising 
the steps of: 

determining a quantization target vector based on 
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coding distortion between a plurality of quantization 
target vector and prediction coefficient used for 
predictive coding; and 

adjusting said prediction coefficients according 
5 to one or more states of on or more previous subframe. 

14. A computer-readable recording medium storing 
a speech encoding program, an adaptive codebook storing 
past used for synthesising excitation vector signal and 
a stochastic codebook storing a plurality of stochastic 
10 excitation vector, said speech coding program comprising 
the steps of: 

obtaining a synthesized speech by filtering 
adaptive excitation vector and stochastic excitation 
vector stored in said adaptive codebook and said 
15 stochastic codebook using decoded LPC coefficients 
obtained from an input speech; 

calculating gains of said adaptive excitation 
vector and said stochastic excitation vector; 

performing vector quantization on the adaptive 
20 excitation vector and stochastic excitation vector 

determined using coding distortion between said input 
speech and said synthesized speech; and 

calculating correlation values by performing 
pitch analysses of a plurality of subframes in the 
25 processing frame before performing an adaptive codebook 
search of the first subframe and calculating a value most 
approximate to the pitch period using said correlation 
values . 
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15* (Added) A speech encoder comprising a 
dispersed-pulse codebook that generates a vector by 
convoluting a vector containing one or more one non- 
zero elements (elements other than non-zero elements 
5 have values of zero) and a fixed waveform called a 
"dispersion pattern", wherein said dispersed-pulse 
codebook has a configuration different from that of the 
dispersed-pulse codebook on the speech decoder side. 

16. (Added) The speech encoder according to claim 
10 15 , wherein the dispersion pattern storage section, 

which is a component of the dispersed-pulse codebook, 
stores dispersion patterns different from those stored 
in the dispersion pattern storage section on the speech 
decoder side. 

15 17. (Added) The speech encoder according to claim 

16, wherein the dispersion pattern storage section 
stores dispersion patterns obtained by simplifying and 
selecting dispersion patterns stored in the dispersion 
pattern storage section on the speech decoder side. 

20 18. (Added) The speech encoder according to claim 

16, wherein the dispersion pattern storage section 
stores dispersion patterns obtained by replacing 
components of dispersion patterns stored in the 
dispersion pattern storage section on the speech decoder 

25 side with zero at certain intervals. 

19. (Added) The speech encoder according to claim 
16, wherein the dispersion pattern storage section 
stores dispersion patterns obtained by replacing 
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components of dispersion patterns stored in the 
dispersion pattern storage section on the speech decoder 
side with zero for every N samples { N : natural number). 

20, (Added) The speech encoder according to claim 
5 19, wherein the dispersion pattern storage section 
stores dispersion patterns obtained by replacing 
components of dispersion patterns stored in the 
dispersion pattern storage section on the speech decoder 
side with zero for every 1 sample. 

10 21. (Added) The speech encoder according to claim 

16, wherein the dispersion pattern storage section 
stores dispersion patterns obtained by truncating 
components of dispersion patterns stored in the 
dispersion pattern storage section on the speech decoder 

15 side at an appropriate length. 

22. (Added) The speech encoder according to claim 
16, wherein the dispersion pattern storage section 
stores dispersion patterns obtained by truncating 
components of dispersion patterns stored in the 

20 dispersion pattern storage section on the speech decoder 
side at a length of N samples (N: natural number). 

23. (Added) The speech encoder according to claim 
16, wherein the dispersion pattern storage section 
stores dispersion patterns obtained by truncating 

25 components of dispersion patterns stored in the 

dispersion pattern storage section on the speech decoder 
side at a half length. 

24. (Added) A speech decoder that decodes a speech 
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signal having a speech code generated by the speech 
encoder according to claim 15. 

25. (Added) A signal processing processor 
containing a software program that implements the speech 

5 encoder according to claim 15. 

26. (Added) A signal processing processor 
containing a software program that implements the speech 
decoder according to claim 24. 

27. (Added) A speech encoding/decoding system 

10 comprising a speech encoder and/or a speech decoder each 
having a dispersed-pulse codebook in configuration 
different from each other. 

28. (Added) The speech encoding/decoding system 
according to claim 27, wherein the difference in the 

15 configuration of the dispersed-pulse codebook between 
the speech encoder and the speech decoder lies in the 
shape of dispersion patterns stored in the respective 
dispersed-pulse codebooks . 

29. (Added) The speech encoding/decoding system 
20 according to claim 28, wherein the shapes of dispersion 

patterns of the speech encoder are obtained by 
simplifying the shape of dispersion patterns of the 
speech decoder. 

30. (Added) The speech encoding/decoding system 
25 according to claim 27, wherein the shapes of dispersion 

patterns of the speech encoder are obtained by replacing 
components of the dispersion patterns of the speech 
decoder with zero at appropriate intervals. 
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31. (Added) The speech encoding / decoding system 
according to claim 27, wherein the shapes of dispersion 
patterns of the speech encoder are obtained by replacing 
components of the dispersion patterns of the speech 

5 decoder with zero every N samples (N: natural number) . 

32. (Added) The speech encoding/decoding system 
according to claim 31 , wherein the shapes of dispersion 
patterns of the speech encoder are obtained by replacing 
components of the dispersion patterns of the speech 

10 decoder with zero every 1 sample. 

33. (Added) The speech encoding/decoding system 
according to claim 27, wherein the shapes of dispersion 
patterns of the speech encoder are obtained by truncating 
components of the dispersion patterns of the speech 

15 decoder at an appropriate length. 

34. (Added) The speech encoding/decoding system 
according to claim 27, wherein the shapes of dispersion 
patterns of the speech encoder are obtained by truncating 
components of the dispersion patterns of the speech 

20 decoder at a length of N samples (N: natural number). 

35. (Added) The speech encoding/decoding system 
according to claim 27, wherein the shapes of dispersion 
patterns of the speech encoder are obtained by truncating 
components of the dispersion patterns of the speech 

25 decoder at a half length. 

36. (Added) A communication base station equipped 
with the signal processing processor according to claim 
25 . 
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37. (Added) A communication terminal equipped with 
the signal processing processor according to claim 25. 

38. (Added) A radio communication system that 
connects the communication base station according to 
claim 36 with the communication terminal according to 
claim 37 via a radio network. 
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ABSTRACT 

A vector codebook 1094 storing a plurality of typical 
samples of quantization target vectors is created. Each 
vector consists of three elements, which are values 
5 corresponding to logarithmic values of an AC gain and 
SC gain and an adjustment coefficient of SC prediction 
coefficient. Prediction coefficient storage section 
1095 stores coefficients to perform predictive coding. 
These coefficients are MA prediction coefficients and 
10 a number of coefficients corresponding to the degree of 
prediction, of two types, AC and SC, are stored. 
Parameter calculation section 1091 calculates 
parameters necessary for distance calculations from the 
input perceptual weighted input speech, perceptual 
15 weighted LPC synthesis of adaptive code vector, 

perceptual weighted LPC synthesis of stochastic code 
vector, further decoded vectors (AC, SC, adjustment 
coefficient) stored in decoded vector storage section 
1096 and prediction coefficients (AC, SC) stored in 
20 prediction coefficient storage section 1095. 
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Declaration and Power of Attorney For Utility or Design Patent Application 



Japanese Language Declaration 



Xh'O, 



As a below named inventor, I hereby declare that: 



My residence, post office address and citizenship are as stated 
beiow next to my name. 

I believe I am the original, first and sole inventor (if only one name 
is listed beiow) or an original, first and joint inventor (if plural 
names are listed below ) of the subject matter which is claimed 
and for which a patent is sought on the invention entitled 



APPARATUS ANFI MffTHOn FOP <apF Ff" H COPING 



□ 



the specification of which is attached hereto unless the following 
box is checked: 

□ was filed on aq 



United States Application Number 
and was amended on 



(if applicable) or, 



mt^mmmmm^mi 19& (a-d) ^xnmses^iw m^.m^< , 



PCT International Application Number PCT/JP99/03064 
and was amended on __(jf applicable) 

I hereby state that I have reviewed and understand the contents of 
the above identified specification, including the claims, as 
amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to 
patentability as defined in Title 37, Code of Federal Regulations 
§1.56. 

I hereby claim foreign priority under Title 35, United States Code 
§119(a-d) or §365(b) of any foreign application(s) for patent or 
inventor's certificate, or §365(a) of any PCT international 
application which designated at least one country other than the 
United States, listed below. I have also identified below, by 
checking the "No" box, any foreign application for patent or 
inventor's certificate, or of any PCT international application having 
a filing date before that of the application on which priority is 
claimed: 



Prior foreign applications 

JP11-235050 JAPAN 

(Number) (Country) 

<#-*§-) m%) 

JP11-23R72R JAPAN 

(Number) (Country) 

(#-*§-) 

JP11-248363 JAPAN 

(Number) (Country) 

(#-s§0 (S£) 



23/August/1999 



Priority claimed 



(Day/Month/Year Filed) 

24/Angiist/1999 



(Day/Month/Year Filed) 

02/Sept ember/1 999 



(Day/Month/Year Filed) 



9> 



Yes 

m 



Yes 

m 



Yes 



□ 

No 

□ 
No 

□ 
No 



□ Additional foreign application numbers are listed on a 
supplemental priority sheet attached hereto. 
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Japanese Language Utility or Design Patent Application Declaration 



mt, £*@i£ft^35gp^n9* (e)il^^<, TfSco^@^#fPtH I hereby claim the benefit under Title 35, United States Code 

H<D$i]&£r^3IH~'5o §11 9(e) of any United States provisional application(s) listed 

below. 



(Application Number) (Day/Month/ Year Filed) 



(Application Number) (Day/Month/Year Filed) 



(Application Number) (Day/Month/Year Filed) 



□ ^(Di^^miKm^ihm^nmmmmm^mm^xmm^o D Additional provisional application numbers are listed on a 

supplemental priority sheet attached hereto. 



^365^ (c) mz.m~3<&%tm&ife£ ^±pcjmmihm^m^±mu * 
! ^t, %<d mm<o aim b t^mcomft&m b xitPCTW&mm b <d 



I hereby claim the benefit under Title 35, United States Code 
§120 of any United States application (s), or §365(c) of any PCT 
international application designating the United States, listed 
below and, insofar as the subject matter of each of the claims of 
this application is not disclosed in the prior United States or PCT 
international application in the manner provided by the first 
paragraph of Title 35, United States Code §112, I acknowledge 
the duty to disclose information which is material to patentability 
as defined in Title 37, Code of Federal Regulations §1.56 which 
became available between the filing date of the prior application 
and the national or PCT international filing date of this 
application. 



(Application No.) (Day/Month/Year Filed) (Status) 

(tHH#fO Wm<n*¥-Mft) (WfFS^^JS^^SfeSiS^) (patented, pending, abandoned) 



(Application No.) (Day/Month/Year Filed) (Status) 

Wm&W) (mm^MB) <#£W^^M*>&i^r#0 (patented, pending, abandoned) 



□ ^<D^^mxnmmww^m^mmm^mm^mm^xmm 



□ Additional U. S. or international application numbers are listed 
on a supplemental priority sheet attached hereto. 

I hereby declare that all statements made herein of my own 
knowledge are true and that all statements made on information 
and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the 
United States Code and that such willful false statements may 
jeopardize the validity of the application or any patent issued 
thereon. 

The undersigned hereby authorizes the U.S. attorney or agent 
named herein to accept and follow instructions from either his 
foreign patent agent or corporate representative, if any, as to any 
action to be taken in the Patent and Trademark Office regarding 
this application without direct communication between the U.S. 
attorney or agent and the undersigned. In the event of a change 
in the persons from whom instructions may be taken, the U.S. 
attorney or agent named herein will be so notified by the 
undersigned. 
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Japanese Language Utility or Design Patent Application Declaration 



JH§*-^ 7055 

mt£m&£tht^m±i±TmcDmyx3bz> 0 



Neil R Greenblum 
Bruce H. Bernstein 
James L. Rowland 
Arnold Turk 



POWER OF ATTORNEY: As a named inventor, I hereby 
appoint the attorney(s) and/or agent(s) associated with the 
Customer Number provided below to prosecute this 
application and transact all business in the Patent and 
Trademark Office connected therewith, and direct that ail 
correspondence be addressed to that Customer N umber: 

i^^CUSTOMER NUMBER 7055 



The appointed attorneys presently include: 

Reg. No. 28,394 
Reg. No. 29,027 
Reg. No. 32,674 
Reg. No. 33,094 



Address: GREENBLUM & BERNSTEIN, P.L.C. 

1941 ROLAND CLARKE PLACE 
RESTON, VA 20191 



Direct Telephone Calls to: (name and telephone number) 



GREENBLUM & BERNSTEIN, PX.C. 

(703)716-1191 



, Full name of sole or first inventor 
Kazutoshi YASUNAGA 



Inventor's signature Date 
esidertee Kawasaki-shi, Kapfaqawa, Japan 



mm 



Residei 



agawa^Japan 



Citizenship 
Japan 



Post Office Address 3-33-17-305, Sugao, Miyamae-ku, 

Kawasaki -shi, Kanagawa 21 6-001 5 Japan 



Full name of second joint inventor, if any 
Toshiyuki MORll 



Second Inventor's signature 



Date 



mm 



Residence & Kawasaki-shi, Kanagawa Japan 



Citizenship 
Japan 



Post Office Address 2-3-7-501 , Nijigaoka, Asao-ku, 

Kawasaki-shi, Kanagawa 215-0015 Japan 



m^fz^nm^^mmm^n^xhmmmm 



(Supply similar information and signature for third and 
subsequent joint inventors.) 
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